Trinath Sahoo

Root Cause Failure Analysis



it occurred

       Why it occurred

       Actions for averting reoccurrence that can be developed and implemented

      The root cause analysis (RCA) process has five identifiable steps.

      1 Define the problem

      2 Collect data

      3 Identify possible causal factors

      4 Identify the root cause

      5 Recommend and implement solutions

      One of the important steps in root cause failure analysis (RCFA) is defining the problem. Effective problem and event descriptions help ensure that appropriate root cause analyses are carried out. The first step in defining the problem is to ask four questions:

       What is the problem?

       When did it happen?

       Where did it happen? and

       How did it impact the goals?

      The investigator or RCA analyst is seldom present when an incident or failure occurs. Therefore, the first information report (FIR) is the initial notification that an incident or failure has taken place. In most cases, the communication will not contain a complete description of the problem. Rather, it will be a very brief description of the perceived symptoms observed by the person reporting the problem.

      This failure report describes the incident, including the time, place, and nature of the failure and its impact on the organization.

      Consider a problem on a centrifugal pump AC motor. A typical problem report could state “pump ABC motor has a problem.” This type of problem reporting could be worse, for example “fan is bad” or “shrill noise from one of the pumps,” but “pump ABC motor has a problem” is still not a very good definition.

      A better definition may be “AC motor of pump ABC is hot.” Can we do better with some basic root cause analysis steps? Sure! Let’s ask the traditional WHAT, WHERE, WHEN, and EXTENT questions. The problem is:

      The above definition is usually enough to get work on a problem started. Is it ideal? Perhaps not, but it’s pretty good for a problem statement. Problem reporting at this level by craftspeople and operators would be a huge improvement in day‐to‐day root cause analysis for most plants.
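
      As a minimal sketch of how the WHAT, WHERE, WHEN, EXTENT questions can be used to check a problem report, the snippet below treats each question as a field and lists the ones left unanswered. The `ProblemStatement` class and its field names are hypothetical and chosen only for illustration; they do not come from any standard or tool.

```python
from dataclasses import dataclass, fields


@dataclass
class ProblemStatement:
    """Hypothetical record holding answers to WHAT, WHERE, WHEN, EXTENT."""
    what: str = ""
    where: str = ""
    when: str = ""
    extent: str = ""

    def missing(self) -> list:
        """Return the names of the questions that are still unanswered."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# A report that states only the symptom leaves three questions open.
report = ProblemStatement(what="AC motor of pump ABC is hot")
print(report.missing())  # ['where', 'when', 'extent']
```

      A report with an empty `missing()` list is not necessarily a good problem statement, but a non-empty list is a quick signal that the person reporting the problem should be asked for more detail.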

      Data collection is the second phase of the RCA process, and an important one. Acquiring failure data about the incident is key to getting valuable results from the RCA investigation. Comprehensive and relevant failure data are crucial to identifying and understanding the root causes of a failure accurately. Incorrect, inadequate, or insufficient data can lead to undesired RCA results.

      It is important to collect data immediately after a failure occurs, so that accurate information and evidence are gathered before they are lost. The information that should be collected includes the personnel involved; conditions before, during, and after the event; environmental factors; and any other information required for the root cause analysis process.

      Every effort should be made to preserve physical evidence such as failed components, ruptured gaskets, burned leads, blown fuses, spilled fluids, partially completed work orders, and procedures. All work orders and procedures must be preserved. Event participants and other knowledgeable individuals should be identified. After the data associated with the event have been collected, the data should be verified to ensure accuracy.

      Data for any failure could include previous failure reports, maintenance and operations data, process data, drawings, design information, physical evidence, failed equipment parts, and any other necessary information related to the particular failure. Not every failure requires comprehensive data, but sometimes data are missing and the gathered data are not sufficient to identify the actual causes of the failure. The collected data must therefore be accurate and relevant. A failure cannot be investigated properly without correct and related data. Data collection usually consumes more time than the other steps of the RCA process, so the data must be precise and meaningful for identifying the exact causes of failure. Information drawn from the gathered data is significant for making recommendations and conclusions.

      When investigating an incident involving equipment failure, the first job is to preserve the physical evidence. The instrumentation and control settings and the actual readings before the failure happened should be fully documented for the investigating team. In addition, the operating and process data; the approved standard operating procedures (SOP) and standard maintenance procedures (SMP); copies of log books, work packages, work orders, work permits, and maintenance records; etc. should be preserved.
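
      The items to preserve can be tracked as a simple checklist. The sketch below is only an illustration: the list entries are taken from the paragraph above, while the `outstanding` helper and the way items are matched (case-insensitive text comparison) are assumptions made for this example.

```python
# Items named in the text above; the tracking scheme itself is an assumption.
EVIDENCE_TO_PRESERVE = [
    "physical evidence",
    "instrumentation and control settings",
    "readings before the failure",
    "operating and process data",
    "SOP and SMP",
    "log books",
    "work packages",
    "work orders",
    "work permits",
    "maintenance records",
]


def outstanding(preserved):
    """Return checklist items not yet preserved, in checklist order."""
    secured = {item.lower() for item in preserved}
    return [item for item in EVIDENCE_TO_PRESERVE if item.lower() not in secured]


# Hypothetical status partway through an investigation.
secured_so_far = ["Log books", "Work orders", "Physical evidence"]
remaining = outstanding(secured_so_far)
print(len(remaining))  # 7
```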

       Some methods of gathering information include:

       Conducting interviews/collecting statements: Interviews must be fact finding and not fault finding. Preparing questions before the interview is essential to ensure that all necessary information is obtained.

       Interviews should be conducted, preferably in person, with those people who are most familiar with the problem. Although preparing for the interview is important, it should not delay prompt contact with participants and witnesses. The first interview may consist solely of hearing their narrative. A second, more‐detailed interview can be arranged, if needed. The interviewer should always consider the interviewee’s objectivity and frame of reference.

       Reviewing records: Review relevant documents or portions of documents, and reference their use in support of the root cause analysis.

       Acquiring related information: Additional activities that an evaluator should consider when analyzing the causes include:

       Evaluating the need for laboratory tests, such as destructive/nondestructive failure analysis.

       Viewing physical layout of system, component, or work area; developing layout sketches of the area; and taking photographs to better understand the condition.

       Determining if operating experience information exists for similar events at other facilities.

       Reviewing equipment supplier and manufacturer records to determine whether correspondence has been received addressing this problem.

      Interviews

      For critical incidents, all key personnel involved must be interviewed to get a complete picture of the incident. Individuals having direct or indirect knowledge that could help clarify the case should also be interviewed.

      Questions to Ask

       What happened?

       Where did it happen?

       When did it happen?

       What changed?

       Who was involved?

       Why did it happen?

       What is the impact?

       How can recurrence be prevented?

      The sequence of events helps in finding out which cause first triggered the incident. It helps in organizing the information and establishes the relationship between the events and the incident.
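
      A sequence of events can be sketched as a list of timestamped observations sorted from earliest to latest. The example below is a minimal illustration: the `Event` type, the `build_timeline` helper, and all the timestamps and descriptions are hypothetical, invented only to show the ordering step.

```python
from datetime import datetime
from typing import NamedTuple


class Event(NamedTuple):
    time: datetime
    description: str


def build_timeline(events):
    """Return the events sorted from earliest to latest."""
    return sorted(events, key=lambda e: e.time)


# Hypothetical observations, listed in the order they were reported.
reported = [
    Event(datetime(2024, 1, 5, 14, 30), "motor tripped"),
    Event(datetime(2024, 1, 5, 13, 10), "bearing temperature alarm"),
    Event(datetime(2024, 1, 5, 14, 5), "shrill noise reported"),
]

timeline = build_timeline(reported)
print(timeline[0].description)  # bearing temperature alarm
```

      The earliest entry is only a candidate for the triggering cause. It still has to be confirmed against the evidence and interviews, since the first recorded event may itself be a symptom of something that was not observed.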