Root Cause

The origin of an occurrence travels through multiple stages before it is analysed as a root cause. In aviation safety, accident prevention and the Safety Management System (SMS), conventional wisdom is that an occurrence can have multiple root causes. There might be multiple root causes, but only one primary root cause breaks away and leads the way in defining the scope of the root cause analysis. The first step in a root cause analysis is not to learn why an occurrence happened or why a latent hazard became an issue, but to assign the scope of the analysis to predetermined root cause factors. One reason for assigning predetermined root cause factors is to work within a structured analysis system. SMS is also a businesslike approach to safety. The aviation industry put a safety management system in place as an extra layer of protection for incremental safety improvements. A root cause analysis conducted outside of a structured system is without directional control. Outside of a structured system, opportunities and failures are introduced into the process, the analysis follows the path of least resistance, and the root cause analysis is guaranteed to fail.

A root cause analysis needs to be conducted in a 3D system measured in time (speed), space (location), and compass (direction), and within the scope of human factors, organizational factors, supervision factors and environmental factors. A 3D analysis system places the reversal-of-events process in a true environment of events. However, assigning and implementing changes to operations based on a root cause analysis is not a guarantee that the same or similar occurrences are eliminated in the future. This is a fundamental principle of an SMS, published by ICAO as “Safety is not risk free.” An SMS regulation states that an SMS Enterprise needs a process for the internal reporting and analyzing of hazards, incidents, and accidents and for taking corrective actions to prevent their recurrence. Conforming to this regulation does not guarantee the elimination of future occurrences; it calls for a corrective action, under the control of the enterprise, that could have prevented the non-compliance. The purpose of a root cause analysis is to predict, with a 95% confidence level, the probability of a successful outcome without an unscheduled event. There are more contributing factors beyond the control of an operator than there are factors under their control.

It is crucial for the successful application of a root cause analysis to know what a root cause is not. A root cause analysis is not perfect, and it is not a magic wand that keeps accidents from ever happening again. A root cause is not a system where prescriptive expectations are applied as regulations. A root cause statement is not a one-size-fits-all model, and a root cause is not a model where everything is grouped. A root cause analysis is not about emotions, wishes or dreams, but is an imperfect system applied to proactive processes. Working with an imperfect system opens millions of doors of opportunity for improvement, while a perfect system is rigid, without justification to be changed. We all know the saying, “If it ain’t broke, don’t fix it.”

A safety management system is about human behaviors and how external events affect internal emotions and human behaviors. This makes a root cause analysis within an SMS different from a root cause analysis of mechanical or tangible items. A root cause analysis of material strength only needs one special cause variation, or one failure, to analyse its system. Material is reliable and, when produced the same way, will provide the same output. Human factors are different in that the same input, such as training and learning, does not provide the same operational output from different people.

A Non-Destructive Testing (NDT) system is a system to detect flaws within a material or on its surface, and to establish whether a production process produces flaws or failures. There are different independent systems within an NDT system, and none of these systems are compatible to interact with the other systems. Some frequently used NDT inspection processes are X-ray, ultrasound, magnetic particle, fluorescent penetrant, and acid inspections. X-ray inspection is applied to inspect for flaws within a material to relatively fine and defined resolutions. Ultrasound is also applied to inspect for flaws within a material, but to relatively coarse and undefined resolutions. Magnetic particle inspection is applied to the discovery of both internal and external material flaws. Fluorescent penetrant inspection is the NDT system applied to external inspection of flaws. Acid inspection is a surface inspection of material temperature variations. Within an NDT system, all these independent systems function together to produce an effective system that functions as it was designed to. None of these NDT inspection methods is inferior to another; each is just a part of one total system to manage, or lead, processes to produce a flawless output.

In the same way as an NDT system defines the scope of its intended inspection, and the scope of a root cause analysis after a failure discovery, a root cause analysis within a safety management system must also define its scope and root cause analysis factors. In a material failure root cause analysis, the scope is predefined and could be the mixtures, the oven temperatures, the vacuum chamber, the manufacturing process or the assembly process. Without a defined scope, a root cause is only an opinion about the 5-Ws and How. A root cause analysis within an SMS Enterprise establishes human factors, organizational factors, supervision factors and environmental factors as its primary scope of analysis. Several other factors could be added, such as mechanical factors, electronic factors, material factors, economic factors, ergonomic factors and more.

Assume for a moment that there was a flaw in a compressor disk built for extremely high RPM. An undetected microscopic flaw could cause major destruction to the compressor itself and the equipment it was powering. When a flaw or material failure is discovered, the scope of the root cause analysis must first be decided. The root cause could be of human factors, inspection processing factors, material composite factors or manufacturing factors. Each factor may have contributed to the flaw, but only one factor would be the primary root cause for a corrective action plan.
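
The compressor disk example can be sketched as a small record that decides the scope first, accepts contributing factors only from within that scope, and allows one primary root cause. This is only an illustration under stated assumptions: the names Factor, ScopedAnalysis, add_contributing and set_primary are hypothetical and do not come from any SMS regulation or software tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Factor(Enum):
    # The four factors named in the compressor disk example.
    HUMAN = "human factors"
    INSPECTION_PROCESSING = "inspection processing factors"
    MATERIAL_COMPOSITE = "material composite factors"
    MANUFACTURING = "manufacturing factors"


@dataclass
class ScopedAnalysis:
    """Hypothetical record: the scope is decided before any cause is named."""
    occurrence: str
    scope: frozenset
    contributing: set = field(default_factory=set)
    primary: Optional[Factor] = None

    def add_contributing(self, factor: Factor) -> None:
        # A factor outside the decided scope is rejected, not silently added.
        if factor not in self.scope:
            raise ValueError(f"{factor.value} is outside the decided scope")
        self.contributing.add(factor)

    def set_primary(self, factor: Factor) -> None:
        # Several factors may contribute, but only one is held as primary.
        if factor not in self.contributing:
            raise ValueError("primary root cause must be a contributing factor")
        self.primary = factor


analysis = ScopedAnalysis(
    occurrence="flaw in compressor disk",
    scope=frozenset({Factor.INSPECTION_PROCESSING, Factor.MANUFACTURING}),
)
analysis.add_contributing(Factor.MANUFACTURING)
analysis.add_contributing(Factor.INSPECTION_PROCESSING)
analysis.set_primary(Factor.MANUFACTURING)
```

In this sketch, an attempt to add human factors after the scope was set to inspection processing and manufacturing raises an error, which mirrors the point that the scope is decided before conclusions are drawn.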

A root cause analysis within an SMS Enterprise is prone to pre-analysis conclusions, or jumping to conclusions, without first determining the scope of analysis. When a root cause analysis is assigned to a responsible person, the first step is to ask the 5-Why root cause analysis question. When the cause is predetermined, the first Why-question demands a trail that leads to a predetermined answer. A root cause analysis outcome may be affected by intimidation, or by high-level management demanding that the root cause be identified as human error. Should an SMS manager oppose their demand to jump to the human error conclusion, senior managers may become verbally abusive, feel ignored, feel that their opinions are not important, and find it shocking that their SMS manager is running a program that nobody has control over. This is a hypothetical scenario, but one with a probable likelihood of occurring. To remain neutral, a root cause analysis needs to first establish its scope.

The first purpose of a root cause analysis is to identify system-level findings: non-compliances that show a system-wide deficiency of an enterprise system. Examples of such systems are the safety management system, quality assurance program, operational control system, maintenance control system, or a training program system.

The second purpose of a root cause analysis is to identify process-level findings: an enterprise process that did not function and resulted in a non-scheduled output. Examples of processes applicable in various aviation industry sectors include, but are not limited to, the documentation control process, safety risk management process, internal audit process, or emergency response testing processes.

When a root cause analysis has established its scope and purpose, the assigned corrective action has an opportunity to successfully prevent further occurrences.


OffRoadPilots


