Root Cause Statements Are Opinions

If two safety managers independently conducted a root cause analysis of the same occurrence, they would most likely produce two different root cause statements.

One method used to arrive at a coherent root cause statement is to assign the root cause to the pilot in command. When different persons or organizations conduct root cause analyses of identical hazards or incidents, a root cause that is allocated, or assigned, to a person is a predetermined root cause. The justification for assigning a root cause to a vehicle operator, a pilot, or a maintenance person is often that they failed to follow standard operating procedures (SOP). Standard operating procedures are tasks that a person is required to complete, and when a person misses an item on the SOP, the root cause is assigned to that single missed item. When a root cause analysis leads to a reference document, other factors are excluded and no other options remain available. A reliable root cause analysis leads to one of four factors: human factors, organizational factors, supervision factors, or environmental factors. It also stays within the scope and responsibility of the person, department, or organization conducting the analysis.

The 5-Why analysis is frequently used as a root cause analysis tool. A flaw with the simple version of the analysis is that after the first Why-question is asked, the path is laid out for the direction the root cause will take. A more reliable root cause analysis is to use a 5x5 matrix and ask the Why-question 25 times before concluding with a root cause statement. When the root cause has been established, assign one of the four factors as the primary root cause factor.
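The 5x5 matrix can be sketched in code as a simple worksheet. This is a minimal illustration only: the layout of the matrix is not described here, so the sketch assumes that each of the five rows is a separate line of questioning with five Why-answers. The names `Factor`, `new_matrix`, `record_answer`, `is_complete`, and `conclude` are hypothetical.

```python
from enum import Enum


class Factor(Enum):
    """The four factors a root cause is assigned to."""
    HUMAN = "human factors"
    ORGANIZATIONAL = "organizational factors"
    SUPERVISION = "supervision factors"
    ENVIRONMENTAL = "environmental factors"


ROWS = 5  # assumed: five separate lines of questioning
COLS = 5  # five Why-answers per line, 25 in total


def new_matrix():
    """Return an empty 5x5 worksheet; None marks an unanswered Why-question."""
    return [[None] * COLS for _ in range(ROWS)]


def record_answer(matrix, row, col, answer):
    """Store the answer to one Why-question."""
    matrix[row][col] = answer


def is_complete(matrix):
    """True only when all 25 Why-questions have an answer."""
    return all(cell is not None for row in matrix for cell in row)


def conclude(matrix, statement, factor):
    """Return the root cause statement with its primary factor.

    Refuses to conclude until every cell of the matrix is filled in.
    """
    if not is_complete(matrix):
        raise ValueError("all 25 Why-questions must be answered first")
    return {"statement": statement, "primary_factor": factor}
```

The point of the sketch is the guard in `conclude`: a root cause statement cannot be written after the first line of questioning alone, which is the flaw of the simple 5-Why version.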

A root cause analysis tracks the process backwards to the fork-in-the-road where a different action most likely would have changed the outcome. It must be based on the fact that the future was unknown at the time of the occurrence. When an incident is analyzed, the outcome is already known, which adds a challenge to defining an unbiased root cause statement. Other challenges are to overcome pressure from within the organization and from public opinion to deliver a root cause statement immediately, and pressure to deliver a root cause for something that is beyond operational control or outside the scope of data collection sources. A user or client may also demand that an airport or airline publish a root cause statement for a third-party contractor incident.

After an aircraft incident, the first Why-question is: Why did the aircraft crash? There could be several answers, but the least challenging answer is that the pilot failed to remain on the runway. This answer already points the finger at the pilot as being the root cause. Since the analysis is named the 5-Why, let’s keep asking more questions. The answer to each question becomes the basis for the next question. Why did the pilot fail to remain on the runway? Because the pilot failed to apply proper control inputs. The next Why-question is: Why did the pilot fail to apply proper control inputs? Answer number three is that the pilot did not recognize the crosswind when landing. Why did the pilot not recognize the crosswind? Because the pilot did not calculate the crosswind component prior to landing. The last question, number five, which leads to the root cause, is: Why did the pilot not calculate the crosswind component? The answer to number five, and the root cause, is that the pilot failed to use the approach and landing checklist requirement to check the crosswind component when the wind velocity is greater than 20 KTS.
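For readers unfamiliar with the calculation the checklist item refers to, the crosswind component is the wind speed multiplied by the sine of the angle between the wind direction and the runway heading. The sketch below is a minimal illustration; the function names and the sample wind and runway values are hypothetical, and only the 20 KTS threshold comes from the checklist requirement described above.

```python
import math

# Threshold taken from the checklist requirement in the example above.
CHECKLIST_WIND_THRESHOLD_KTS = 20


def crosswind_component(wind_speed_kts, wind_direction_deg, runway_heading_deg):
    """Return the crosswind component in knots (always non-negative)."""
    angle = math.radians(wind_direction_deg - runway_heading_deg)
    return abs(wind_speed_kts * math.sin(angle))


def crosswind_check_required(wind_speed_kts):
    """True when the wind velocity is greater than the checklist threshold."""
    return wind_speed_kts > CHECKLIST_WIND_THRESHOLD_KTS


# Hypothetical values: wind from 300 degrees at 30 KTS, runway heading 270.
if crosswind_check_required(30):
    print(round(crosswind_component(30, 300, 270), 1))  # prints 15.0
```

A calculation like this only describes the wind reported before landing. It says nothing about a wind that changes within seconds during the landing itself.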

This root cause analysis identifies many things that the pilot did not do. What is lacking in the analysis is what the pilot did. During the times when a pilot failed to perform a specific task, the pilot did something else. Time does not evaporate, and when a required task was missed, that time period was filled with something else that the pilot did. When blame or a root cause is assigned to pilot failure, valuable information about what actually happened is missed. What did not happen is irrelevant to the root cause analysis. There are many things the pilot failed to do or did not do. Some of these could be that the pilot failed to acknowledge the landing clearance, failed to set bug-speed, failed to put on the shoulder harness, failed to touch down in the landing area, or failed to check that the parking brakes were off prior to touchdown, and more. That a pilot failed to complete a required task does not necessarily cause an accident. On the other hand, something that a pilot does may cause an accident. A contributing cause to the Everglades crash in 1972 was that the crew did their best to fix a failed light. They filled their time with tasks other than aircraft control tasks, which is an important fact in determining the root cause. In the runway example, the factor causing the airplane to crash outside of the runway pavement was that the crosswind increased within seconds from 21 KTS to 48 KTS during the landing, and the pilot encountered an extreme downdraft in the overshoot.

In the aviation industry, human factors and human errors are accepted as being the same thing. However, human error is a symptom of trouble deeper inside a system or an organization. On the other hand, human error is also a symptom of a successful organization. There are organizations where human errors are integrated with the system and need to be there for the organization to exist and prosper. It is the system itself that is set up for human errors.

Conventional wisdom, which uses emotions to describe an event, is that human error is a “bad” thing. Human error is a sub-category of human factors. Simplified, human factors are how a person reacts when one or more of the five senses (vision, hearing, smell, taste, and touch) are triggered. Human factors are also how external forces or events, e.g., fatigue, weather, illumination and more, affect performance.

In an organization where there are overwhelming events of human errors, the organization operates within a system that is prone to these errors. Two examples are car races and air races, where the system (race to win) sets each driver and pilot up for human error, or a crash. Both car race and air race organizers have requirements and systems in place to reduce harm to drivers, pilots, and spectators.

These systems are designed for human errors. Imagine how successful a car race would be if the speed were limited to 50 MPH, or if an air race required airplanes to fly between gates a mile apart.
  
For a corrective action to be applied to a root cause, an operator must understand the finding and the associated hazard. System level findings identify the system and regulatory requirement that failed, and process level findings identify the process that did not function as expected. To develop an effective corrective action plan (CAP), an SMS enterprise must comprehend the nature of the system or process deficiency that led up to the error.

System level findings group non-compliances that show a system-wide deficiency of an SMS enterprise system. Examples of systems are the safety management system, a quality assurance system, operational control system, maintenance control system, training system, or airport preventive maintenance system.

Process level findings identify processes that did not function and resulted in an incident or regulatory non-compliance. Examples of processes are the documentation control process, safety risk management process, system analysis process, audit process, or an airport emergency response plan exercise process. It is vital for the success of a corrective action plan that it is applied to the correct system or process, as applicable to the finding.

The purpose of a factual review of a finding is to define the scope of the problem in the system. An SMS enterprise must clearly identify the policies, processes, procedures, and accepted work practices that allowed the non-compliance to happen. Processes and procedures are usually established through documentation, but also consider undocumented work practices, attitudes, and tolerances that may have developed and drifted over time. An SMS enterprise attacks the finding by defining the problem and making a clear statement of how widespread the non-compliance is in the system. A finding could be isolated to one area of the organization or spread into other departments and functional areas of an SMS enterprise. When explaining how widespread the problem is, write it in clear text with clarification and directions. Problems are addressed in 3D, measured in time (speed), space (location), and compass (direction). After the hazard or problem has been assigned a corrective action plan, the plan is put into action immediately.
  
A system failure could be that the electrical grid to an airport failed. It is beyond the scope and responsibility of an airport to conduct a root cause analysis of the grid system, since airport operators do not control, maintain, or monitor the power grid. That root cause analysis is the responsibility of the power grid operator. An airport operator would conduct a root cause analysis to establish why their backup generator also failed when the power grid failed. Both root causes are allocated to either human factors, organizational factors, supervision factors, or environmental factors. Root cause allocation becomes an opinion about which system the corrective action plan should be applied to.

All systems include multiple sub-systems. The purpose of a vehicle system is to move persons, goods, or services from one point to another. Sub-systems of a vehicle are the transmission system, tire system, control system and more. Should a transmission system fail, it is not a failure of the entire vehicle system, but simply a transmission system failure. When the transmission fails, the task becomes to assign a root cause opinion about why it failed, implement a CAP, test, and monitor whether the repair was successful.

It is vital for a successful safety management system that accountable executives accept the fact that root cause statements are opinions only. 

 

OffRoadPilots

