Abstract: We have all heard the term Root Cause Analysis (RCA), and we likely all interpret its meaning differently. The single biggest reason we see for the ineffective use of RCA is this lack of communication, or miscommunication, among its users. If we are all using various forms of RCA, then when we compare our results we are not comparing apples with apples. We will explore the discipline necessary to bring consistency to our RCA application, thus greatly improving the credibility and communication of the results.
Since the evolution of Total Productive Maintenance (TPM) in the United States, there has been a consistent movement towards exploring the quality of the process versus the quality of the product. Before the advent of TPM, quality organizations were typically content with testing the quality of the finished product as it came off the line. While this was an admirable concept at the time, we learned that by then it was too late to discover quality defects. The entire product and/or lot would have to be reworked at great expense to the organization.
Then the TPM concepts of W. Edwards Deming were introduced, and they pushed the “quality of process” concept. In short, this meant that we would measure key variables within the process stages and monitor them for any unacceptable variances. In this manner, we could correct the process variation and prevent the production of off-spec products. This era has continued into the 21st century with the introduction of Six Sigma.
Now let's take the above summary of TPM and apply it to a non-manufacturing process such as RCA. As we discussed earlier, RCA means different things to different people. Many consider undisciplined efforts such as “trial and error” to be their RCA approach. This means that we perceive a problem to exist and we go straight to the cause that is most obvious TO US! This is the “finished product” approach. We do not validate any of our assumptions; we just assume a cause, spend money to implement a fix and hope it works. Experience shows this approach to be ineffective and very expensive.
Now let's apply the TPM concept to a disciplined method of RCA such as a Logic Tree used in the PROACT® process. A Logic Tree strives to graphically represent the cause and effect relationships that lead to the surfacing of the undesirable event.
In this approach, we must clearly identify the undesirable event and its associated modes with supporting facts. Facts are supported by some form of science, direct observation or documentation. They cannot be hearsay or assumptions!
In the example below, most people would insist that we start with a bearing failure. However, when the event occurred, why was it brought to our attention? It was not brought to our attention because the bearing failed. It was brought to our attention because the failed bearing caused a pump to stop pumping something. Therefore the last effect that caused us concern was the pump failure. One reason (or mode) the pump failed was that the bearing failed. This is clearly evidenced by the failed bearing (physical evidence). The top of the Logic Tree may look like this:
[Figure 1.0 – Event and Mode Supported by Facts. Event: pump failure (Fact: DCS verification). Mode: bearing failure (Fact: physical bearing).]
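For readers who keep their analyses in software, the top of this tree can be sketched as a small data structure. This is only an illustrative sketch and not how the PROACT® software represents a Logic Tree; the `Node` class, the `unsupported` helper and the pairing of each fact with a specific block are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One block in a logic tree (hypothetical structure for illustration)."""
    label: str
    kind: str  # "event" or "mode"
    facts: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def unsupported(node: Node) -> list[str]:
    """Return the labels of all blocks that have no supporting fact."""
    missing = [] if node.facts else [node.label]
    for child in node.children:
        missing.extend(unsupported(child))
    return missing


# Top of the tree from the pump example. Which fact backs which block
# is an assumption made for this sketch.
event = Node("Pump failure", "event", facts=["DCS verification"])
event.children.append(
    Node("Bearing failure", "mode", facts=["Physical bearing"])
)

# An empty list means every block is backed by at least one fact.
print(unsupported(event))
```

The point of the check is the discipline described above: a block with no fact behind it is hearsay or assumption, and the tree should not be built on it.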
Continuing our search backwards through the cause and effect relationships, we would then ask, “How can a bearing fail?” Our hypotheses may be erosion, corrosion, fatigue or overload. How can we prove which are true? We would simply have a metallurgical lab analyze the failed bearing for us and produce an analysis report.
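The hypothesize-then-verify step can be tracked in the same spirit. The sketch below is a minimal illustration, not a prescribed method; the `Hypothesis` class and both helper functions are hypothetical names, and the recorded lab result is a made-up placeholder rather than a finding from any real analysis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    """A possible cause awaiting verification (hypothetical structure)."""
    cause: str
    verified: Optional[bool] = None  # None means not yet tested
    evidence: Optional[str] = None


def record_result(hypotheses, cause, verified, evidence):
    """Attach a verification result and its evidence to one hypothesis."""
    for h in hypotheses:
        if h.cause == cause:
            h.verified = verified
            h.evidence = evidence
            return
    raise KeyError(cause)


def open_items(hypotheses):
    """Return the causes that have not been proven true or false yet."""
    return [h.cause for h in hypotheses if h.verified is None]


hypotheses = [
    Hypothesis(c) for c in ("erosion", "corrosion", "fatigue", "overload")
]

# Placeholder result for illustration only, not a real lab finding.
record_result(hypotheses, "corrosion", False,
              "metallurgical analysis report (placeholder)")

print(open_items(hypotheses))  # ['erosion', 'fatigue', 'overload']
```

Every hypothesis starts untested, and none is ruled in or out without a piece of evidence recorded against it, which is the opposite of the “trial and error” approach.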