Open access

Developing Risk Models for Aviation Inspection and Maintenance Tasks

Written By

Lee T. Ostrom and Cheryl A. Wilhelmsen

Submitted: 11 November 2010 Published: 12 September 2011

DOI: 10.5772/21156

From the Edited Volume

Aeronautics and Astronautics

Edited by Max Mulder

Chapter metrics overview

5,646 Chapter Downloads

View Full Metrics

1. Introduction

Risk assessment has been used to analyze a wide range of industries to determine vulnerabilities with the ultimate purpose of eliminating the sources of risk or reducing them to a reasonable level. The purpose of this chapter is to show how risk assessment tools can be used to develop risk models of aviation maintenance tasks. Two tools will be discussed in this chapter, though many other methods exist. The tools discussed in this chapter are:

  • Failure Mode and Effect Analysis (FMEA)

  • Event and Fault Tree Analysis

Ostrom and Wilhelmsen (2011) discuss a wide range of risk assessment tools and this book provides many examples of how these tools are used to analyze various industries.

Advertisement

2. Failure mode and effect analysis

An FMEA is a detailed document that identifies ways in which a process or product can fail to meet critical requirements. It is a living document that lists all the possible causes of failure from which a list of items can be generated to determine types of controls or where changes in the procedures should be made to reduce or mitigate risk. The FMEA also allows procedure developers to prioritize and track procedure changes (Mil Std 882B, C, 1984 and 1993). The process is effective because it provides a very systematic process for evaluating a system or a procedure, in this instance. It provides a means for identifying and documenting:

  1. Potential areas of failure in process, system, component, or procedure.

  2. Potential effects of the process, system, component, or procedure failing.

  3. Potential failure causes.

  4. Methods of reducing the probability of failure.

  5. Methods of improving the means of detecting the causes of failure.

  6. Risk ranking of failures, allowing risk informed decisions by those responsible.

  7. A starting point from which the control plan can be created.

FMEA can be used to analyze:

  1. Process: Documents and addresses failure modes associated with the manufacturing and assembly process.

  2. Procedure: Documents and addresses failure points and modes in procedures.

  3. Software: Documents and addresses failure modes associated with software functions.

  4. Design: Documents and addresses failure modes of products and components long before they are manufactured and should always be completed well in advance of prototype build.

  5. System: Documents and addresses failure modes for system and subsystem level functions early in the product concept stage.

  6. Project: Documents and addresses failures that could happen during a major program.

A procedure analysis will be used to demonstrate how an FMEA can be conducted. An FMEA is conducted on a step-by-step basis. Table 1 shows an example of an FMEA table. The following constitutes the steps of an FMEA. These steps will be illustrated by use of an example.

Item Potential Failure Mode Cause of Failure Possible Effects Probability Criticality (Optional) Prevention
Step in procedure, part, or component How it can fail:
–pump not working
–stuck valve
–no money in a checking account
–broken wire
–software error
–system down
–reactor melting down
What caused the failure:
Broken part
Electrical failure
Human error
Explosion
Bug in software
Outcome of the failures:
Nothing
System crash
Explosion
Fire
Accident
Environmental release
How possible is it:
Can use numeric values:
0.1, 0.01, or
1E-5
Can use a qualitative measure:
Negligible, low probability, high probability.
How bad are the results:
Can use dollar value:
$10., $1,000., or $1,000,000
Can use a qualitative measure:
Nil, Minimal problems, major problems.
What can be done to prevent either failures or results of the failures?

Table 1.

Example FMEA Table

The first step is to create a flow diagram of the procedure. This is a relatively simple process in which a table or block diagram is constructed that shows the steps in the procedure. Table 2 shows the simple steps checking an engine chip detector. Note that this is a simple example and not an exhaustive analysis. Table 3 lists the major, credible failures associated with each step in the process. Table 4 shows the effect of the potential failures. Table 5 shows the complete FMEA for the task.

Table 2.

Process Steps for checking a chip detector

FMEA is a relatively simple, but powerful tool and has a wide range of applicability for analyzing aircraft maintenance tasks.

Advertisement

3. Event tree and fault tree analysis

An event tree is a graphical representation of a series of possible events in an accident sequence (Vesely, William; et. al., 2002). Using this approach assumes that as each event occurs there are only two outcomes, failure or success. A success ends the accident sequence and the postulated outcome is either that the accident sequence terminated successfully or was mitigated successfully. For instance, a fire starts in an engine. This is the initiating event. Then the automated system closes fuel feed. If the lack of fuel does not extinguish the fire, the next step is that that the fire suppression system is challenged. If the system actuates the fire suppression system the fire is suppressed and the event sequence ends. If the fire suppression system fails the fire is not suppressed then the accident sequence progresses. Table 6 shows this postulated accident sequence. Figure 1 shows this accident sequence in an event tree.

As in most of the risk assessment techniques, probabilities can be assigned to the events and combined using the appropriate Boolean Logic to develop an overall probability for the various paths in the event. Using our example from above, we will now add probabilities to the events and show how the probabilities combine for each path. Figure 2 shows the addition of path probability to the event tree.

Inspecting Chip Detector
Process Steps Major Failures
Cut and Remove Lock Wire from Oil Drain Plug No major failures that affect process outcome
Remove Oil Drain Plug No major failures that affect process outcome
Drain Oil No major failures that affect process outcome
Cut and Remove Lock Wire from Chip Detector No major failures that affect process outcome
Remove Chip Detector Improper removal can remove debris from chip detector and cause false reading. Chip detector can be damaged if improperly removed.
Examine Chip Detector Aircraft Maintenance Technician (AMT) fails to notice debris on chip detector.
Clean Chip Detector AMT fails to properly clean chip detector
Replace Chip Detector AMT fails to properly install chip detector
Lock Wire Chip Detector AMT fails to properly lock wire chip detector
Replace Oil Drain Plug AMT fails to properly install oil drain plug
Lock Wire Oil Drain Plug AMT fails to properly lock oil drain plug
Replace Oil AMT fails to properly replace oil

Table 3.

Failures Associated with Each Step

Inspecting Chip Detector
Process Steps Potential Failure Modes Potential Failure Effects
Remove Chip Detector Improper removal can remove debris from chip detector and cause false reading. Chip detector can be damaged if improperly removed. Engine could fail if chips are not properly detected.

Added cost to replace damaged chip detector.
Examine Chip Detector Aircraft Maintenance Technician (AMT) fails to notice debris on chip detector.
Engine could fail if chips are not properly detected.
Clean Chip Detector AMT fails to properly clean chip detector Debris could be placed back into engine.
Replace Chip Detector AMT fails to properly install chip detector Oil could leak past chip detector.
Threads of chip detector could be damaged.
Lock Wire Chip Detector AMT fails to properly lock wire chip detector Chip detector could become lose and fall out, leading to loss of engine oil.
Replace Oil Drain Plug AMT fails to properly install oil drain plug Engine oil could leak out.
Oil drain plug could become damaged.
Lock Wire Oil Drain Plug AMT fails to properly lock oil drain plug Oil drain plug could become loose and fall out.
Oil drain plug could become damaged.
Replace Oil AMT fails to properly replace oil Engine could fail.

Table 4.

Effect of Potential Failures

Procedure Step Potential Failure Mode Cause of Failure Possible Effects Probability Criticality Prevention
Cut and Remove Lock Wire from Oil Drain Plug No major failures that affect process outcome AMT Fails to Perform Task Delay in performing task. Very Low Not Critical Ensure AMTs follow work schedule
Remove Oil Drain Plug No major failures that affect process outcome AMT Fails to Perform Task Delay in performing task. Very Low Not Critical Ensure AMTs follow work schedule
Drain Oil No major failures that affect process outcome AMT Fails to Perform Task Delay in performing task. Very Low Not Critical Ensure AMTs follow work schedule
Cut and Remove Lock Wire from Chip Detector No major failures that affect process outcome AMT Fails to Perform Task Delay in performing task. Very Low Not Critical Ensure AMTs follow work schedule
Examine Chip Detector AMT fails to notice debris on chip detector. AMT Fails to Properly Perform Task Engine could fail if chips are not properly detected.

Added cost to replace damaged chip detector.
Moderate Critical Training, procedures, and inspection oversight
Clean Chip Detector AMT fails to properly clean chip detector AMT Fails to Properly Perform Task Engine could fail if chips are not properly detected. Moderate Critical Training, procedures, and inspection oversight
Replace Chip Detector AMT fails to properly install chip detector AMT Fails to Properly Perform Task Debris could be placed back into engine. Moderate Critical Training, procedures, and inspection oversight
Lock Wire Chip Detector AMT fails to properly lock wire chip detector AMT Fails to Properly Perform Task Oil could leak past chip detector.
Threads of chip detector could be damaged.
Moderate Critical Training, procedures, and inspection oversight
Replace Oil Drain Plug AMT fails to properly install oil drain plug AMT Fails to Properly Perform Task Chip detector could become lose and fall out, leading to loss of engine oil. Moderate Critical Training, procedures, and inspection oversight
Lock Wire Oil Drain Plug AMT fails to properly lock oil drain plug AMT Fails to Properly Perform Task Engine oil could leak out.
Oil drain plug could become damaged.
Moderate Critical Training, procedures, and inspection oversight
Replace Oil AMT fails to properly replace oil AMT Fails to Properly Perform Task Oil drain plug could become loose and fall out.
Oil drain plug could become damaged.
Low Critical Training, procedures, and inspection oversight
Engine could fail.

Table 5.

Complete FMEA for Chip Detector Task

Figure 1.

Event Tree

Table 6.

Accident Sequence

Table 7.

Event Sequence with Probabilities

This result of this analysis tells us that the probability derived for a fire in which the fuel feed system stops fuel supply to engine actuates and the consequence in minimal damage is approximately 1/1000 or 1X10-3. The probability derived for a fire in which the fuel feed system fails to actuate, but the fire suppression system successfully extinguishes the fire and there is only moderate damage is 1E-6 or 1X10-6. Finally, the probability that a fire occurs and both the fuel feed system fails and fire suppression system fails and severe damage occurs is 1E-8 or 5X10-8.

Figure 2.

Event Tree with Path Probabilities

This approach is considered inductive in nature. Meaning the system uses forward logic. A fault tree, discussed below, is considered deductive because usually the analyst starts at the top event and works down to the initiating event. In complex risk analyses event trees are used to describe the major events in the accident sequence and each event can then be further analyzed using a technique most likely being a fault tree (Modarres, M., 2006).

As indicated, the fault tree begins at the end, so to speak. This top-down approach starts by supposing that an accident takes place (Vesely, William; et. al., 2002). It then considers the possible direct causes that could lead to this accident. Next it looks for the origins of these causes. Finally it looks for ways to avoid these origins and causes. The resulting diagram resembles a tree, thus the name.

Fault trees can also be used to model success paths as well. In this regard they are modeled with the success at the top and the basic events are the entry level success that put the system on the path to success.

The goal of fault tree construction is to model the system conditions that can result in the undesired event. Before construction of a fault tree, the analyst must acquire a thorough understanding of the system. A system description should be part of the analysis. The analysis must be bounded, both spatially and temporally, in order to define a beginning and endpoint for the analysis. The fault tree is a model that graphically and logically represents the various combinations of possible events, both fault and normal, occurring in a system leading to the top event. The term “event” denotes a dynamic change of state that occurs to a system element. System elements include hardware, software, human, and environmental factors (Vesely, William; et. al. 2002).

Table 8 shows the most common fault tree symbols. These symbols represent specific types of fault and normal events in fault tree analysis. In many simple trees only the Basic Event, Undeveloped Event and Output Event are used.

Table 8.

Common Fault Tree Symbols

Events representing failures of equipment or humans (components) can be divided into failures and faults. A component failure is a malfunction that requires the component to be repaired before it can successfully function again. For example, when a turbine blade in an engine breaks, it is classified as a component failure. A component fault is a malfunction that will “heal” itself once the condition causing the malfunction is corrected. An example of a component fault is a switch whose contacts fail to operate because they are wet. Once they are dried, they will operate properly.

Output events include the top event, or ultimate outcome, and intermediate events, usually groupings of events. Basic events are used at the ends of branches since they are events that cannot be further analyzed. A basic event cannot be broken down without losing its identity. The undeveloped event is also used only at the ends of event branches. The undeveloped event represents an event that is not further analyzed either because there is insufficient data to analyze or because it has no importance to the analysis.

Logic gates are used to connect events. The two fundamental gates are the “AND” and “OR” gates. Table 9 describes the gate functions and also provides insight to their applicability.

There are four steps to performing a Fault Tree Analysis:

  1. Defining the problem

  2. Constructing the fault tree

  3. Analyzing the fault tree qualitatively

  4. Documenting the results

Table 9.

Logic Gates

A top event and boundary conditions must be determined when defining the problem. Boundary conditions include:

  • System physical boundaries

  • Level of resolution

  • Initial Conditions

  • Not allowed events

  • Existing Conditions

  • Other Assumptions

Top events should be precisely defined for the system being evaluated. A poorly defined top event can lead to an inefficient analysis.

Construction begins at the top event and continues, level by level, until all fault events have been broken into their basic events. Several basic rules have been developed to promote consistency and completeness in the fault tree construction process. These rules, as listed in Table 10, are used to ensure systematic fault tree construction (American Institute of Chemical Engineers, 1992).

Table 10.

Rules for Constructing Fault Trees

Many times it is difficult to identify all of the possible combinations of failures that may lead to an accident by directly looking at the fault tree. One method for determining these failure paths is the development of “minimal cut sets.” Minimal cut sets are all of the combinations of failures that can result in the top event. The cut sets are useful for ranking the ways the accident may occur and are useful for quantifying the events, if the data is available. Large fault trees require computer analysis to derive the minimal cut sets, but some basic steps can be applied for simpler fault trees:

  1. Uniquely identify all gates and events in the fault tree.

  2. If a basic event appears more than once, it must be labeled with the same identifier each time. Resolve all gates into basic events.

  3. Gates are resolved by placing them in a matrix with their events.

  4. Remove duplicate events within each set of basic events identified.

  5. Delete all supersets that appear in the sets of basic events.

By evaluating the minimal cut sets, an analyst may efficiently evaluate areas for improved system safety. The analyst should provide a description of the system analyzed, a well as a discussion of the problem definition, a list of the assumptions, the fault tree model(s), lists of minimal cut sets, and an evaluation of the significance of the minimal cut sets. Any recommendations should also be presented. An example fault tree for the engine fire example is shown in Figure 3.

Figure 3.

Example Fault Tree

Advertisement

4. Summary

This chapter discussed how common risk assessment techniques could be used to perform risk assessments of aviation related activities. As discussed in the very beginning paragraph of this chapter, Ostrom and Wilhelmsen (2011) discuss in depth how to use risk assessment techniques to analyze a wide variety of systems, tasks, and activities.

References

  1. 1. American Institute of Chemical Engineers., 1992 Guidelines for Hazard Evaluation Procedures, New York.
  2. 2. Mil Std 882B, C,1984 and 1993
  3. 3. Modarres M. 2006Risk Analysis in Engineering: Techniques, Tools, and Trends, CRC Press; 1 edition, 1-57444-794-7
  4. 4. Ostrom L. Wilhelmsen C. Summer . 2011Risk Assessment Tools and techniques and Their Application, in Process.
  5. 5. Vesely William. et al. 2002pdf). Fault Tree Handbook with Aerospace Applications. National Aeronautics and Space Administration. http://www.hq.nasa.gov/office/codeq/doctree/fthb.pdf.Retrieved 2010-01-17.

Written By

Lee T. Ostrom and Cheryl A. Wilhelmsen

Submitted: 11 November 2010 Published: 12 September 2011