Open access peer-reviewed chapter

Failure Modes and Effects Analysis

Written By

Jaroslav Menčík

Submitted: 08 January 2016 Reviewed: 03 February 2016 Published: 13 April 2016

DOI: 10.5772/62364

From the Monograph

Concise Reliability for Engineers

Authored by Jaroslav Menčík

Abstract

Failure Mode and Effect Analysis (FMEA) is a simple procedure for systematically revealing possible failures of structures or processes as early as the design stage. The main steps of this procedure are explained. The classification of severity, frequency, and possibility of early detection of the individual failure modes is shown, as well as the calculation of the risk priority number, which serves to identify the most dangerous causes of failure. The application of FMEA is illustrated by an example.

Keywords

  • Failure
  • failure mode
  • severity
  • frequency of occurrence
  • risk
  • FMEA

The preceding chapters described probabilistic methods. This chapter explains a nonprobabilistic method that can increase reliability very effectively.

Failure modes and effects analysis (FMEA) is a simple procedure for systematically revealing possible failures of a structure or process as early as the design or project stage, and for avoiding or mitigating them. The basic idea is that preventing failures is better and cheaper than detecting and repairing them later. The term failure here means any loss of the ability of the object to perform its functions properly.

FMEA was used for the first time in the Apollo project. Today, it is compulsory in the design of aircraft, is used very often in the automotive industry, and is gradually spreading into other branches. Its use is recommended by quality standards such as ISO 9000. In the past, good designers and builders used a similar approach intuitively. The advantage of FMEA is that it is a systematic procedure that ensures everything is done to prevent foreseeable failures of a component, structure, or process. Very importantly, FMEA is not a matter for one expert only, but draws on the knowledge and experience of people from various branches. Their cooperation can have synergistic effects and bring further improvements to the design.

Failure modes and effects analysis can be done in 10 steps.

1. Formulation of the problem and establishing an FMEA team

FMEA can be done for a product (component or structure) or a process. A special team is usually formed for the task. The team should consist of designers, technologists, somebody responsible for the manufacture or building, and somebody representing the future user. The user's practical experience with the operation and maintenance of similar objects is invaluable.

Every FMEA team has a leader, either appointed by the management or selected by the team itself. The role of the leader is to organize and facilitate the FMEA sessions, to secure the resources for the work, and to help the team reach consensus and progress toward completion of the FMEA.

Before starting the analysis, it is necessary to define clearly its scope, the relation of the team to the management, and the team's competences and responsibilities. It is also necessary to set the budget and the deadline for the analysis. All this, including the names of the team leader and members and the way of communicating with the management, should be written down in a document.

2. Review of the construction or process

The purpose of a product FMEA is to reveal problems that could result in safety hazards, product malfunctions, or a shortened life. The key question is “How can the product fail?”

The process FMEA should uncover the problems related to the manufacture, building, or assembly of the product. It is helpful to consider the five elements of a process: people, materials, equipment, methods, and environment. With these elements in mind, the key question is “How can the process failure affect the product, processing efficiency, or safety?”

During the first session, the members of the team should make sure that they understand all necessary details of the construction or process and their interrelations. To ensure this, every member should receive in advance the engineering drawings and documents of the product, or a detailed flowchart of the process or operation. It is helpful to have an expert on the construction or process present who can answer any questions the team might have.

3. Revealing of all potential failure modes

Once the team members understand the product (or process), they can begin thinking about the potential failure modes that can affect the product quality, reliability and safety during its useful life. This should be done during one or more sessions organized according to the rules of brainstorming; information on previous failures is also useful.

In such meetings, no idea or comment should be rejected. However, some people personally involved in the design might feel offended when somebody else finds faults and mistakes in it. The role of the team leader is to facilitate the process, encourage people to contribute ideas and comments, and mitigate negative psychological effects.

4. Listing of potential effects of each failure mode

Once the possible failure modes have been identified, they are written down in a special form (Fig. 1). Then, the FMEA team reviews each failure mode and identifies the potential effects of the failure should it occur. For every failure mode, there may be one or more effects. Again, everything is written into the FMEA form. This is very important, as this information is the basis for assigning risk ratings to each failure mode. It is recommended to use if-then thinking: “If this occurs, what are the consequences?” The form (Fig. 1) helps in taking measures to eliminate some failures or reduce their severity.

Figure 1.

Failure modes and effects analysis worksheet. (In real worksheets, both parts are printed together.)
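The record built up in steps 3 and 4 can be sketched as a small data structure. This is only an illustration: the class and field names, and the pump-seal entries, are hypothetical and are not taken from the worksheet in Fig. 1.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    """A failure mode together with every effect the team identified for it."""
    component: str
    description: str
    effects: list[str] = field(default_factory=list)

    def add_effect(self, effect: str) -> None:
        # Answers the if-then question: "If this occurs, what are the consequences?"
        self.effects.append(effect)


def worksheet_rows(modes: list[FailureMode]) -> list[tuple[str, str, str]]:
    """Flatten the record into (component, failure mode, effect) rows."""
    return [(m.component, m.description, e) for m in modes for e in m.effects]


# Hypothetical entries, invented for illustration only.
seal = FailureMode("pump seal", "seal leaks")
seal.add_effect("fluid loss")
seal.add_effect("contamination of surroundings")

rows = worksheet_rows([seal])
for row in rows:
    print(row)
```

Each effect gets its own row, because the ratings of the next step are assigned per effect and not per failure mode.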

5. Assigning severity, occurrence, and detection ratings for each effect

Each effect is assigned three numbers characterizing its severity, frequency, and probability of early detection, and these numbers are written into the left part of the form (upper part of Fig. 1). Often, each of the ratings is based on a 10-point scale, with 1 being the best case and 10 the worst case; for example,

Severity rating scale:

10 – consequences dangerously high (failure could injure or kill); 8 – consequences very serious (failure renders the object unfit for use); 6 – moderate (failure results in partial malfunction); 4 – very low (there is minor performance loss); 3 – minor (the effects could be overcome without performance loss); 1 – none (failure would not be noticeable).

Occurrence rating scale:

10 – very high probability of occurrence [failure is (almost) inevitable]; 8 – high probability (repeated failures); 6 – moderate probability (occasional failures); 3 – low (relatively few failures); 1 – negligible (failure is unlikely).

Detectability rating scale:

10 – probability of detection (POD) is zero (the object is not inspected or the defect is not detectable); 8 – POD is low (the signs of failure are not easily detectable); 3 – POD is high (the signs of failure are easily detectable, the objects are 100% controlled); 1 – detection of approaching failure is certain [the emerging defect is obvious or there is 100% automatic control (regular inspections, if necessary)].

There are no fixed scales; the classification depends on the character of the object. However, it is important to establish a clear description of the points on each scale so that all team members share the same understanding of the ratings.

When assigning a severity rating, one must be aware that a single failure of a component can have several effects, and each effect can have a different severity.

The best method for determining the occurrence rating is to use actual data from the same or similar product or process. When actual failure data are not available, the team must estimate how often the pertinent failure mode can occur.

The detection rating tells how likely it is that a failure will be revealed before it happens. If there are no controls, the probability of detection is low and the rating is high (9 or 10).
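One way to keep the agreed scale descriptions in front of the whole team is to store them next to the ratings. The sketch below encodes the occurrence scale quoted above. The function names and the fallback message are assumptions made for illustration, and accepting non-integer ratings is a choice of this sketch.

```python
# Occurrence scale as quoted in the text; only some points carry a description.
OCCURRENCE_SCALE = {
    10: "very high probability of occurrence (failure is almost inevitable)",
    8: "high probability (repeated failures)",
    6: "moderate probability (occasional failures)",
    3: "low (relatively few failures)",
    1: "negligible (failure is unlikely)",
}


def check_rating(rating: float) -> float:
    """Accept a rating only if it lies on the 10-point scale (1 best, 10 worst)."""
    if not 1 <= rating <= 10:
        raise ValueError(f"rating {rating} is outside the 1-10 scale")
    return rating


def describe(scale: dict[int, str], rating: float) -> str:
    """Return the agreed description for a rating, if the team wrote one down."""
    check_rating(rating)
    return scale.get(rating, "no description agreed for this point")


print(describe(OCCURRENCE_SCALE, 6))
print(describe(OCCURRENCE_SCALE, 7))
```

Points without a description, such as 7 here, are exactly the ones on which team members are most likely to disagree.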

6. Calculation of the Risk Priority Number (RPN) for each failure mode

Now, the RPN is calculated by multiplying the severity rating by the occurrence rating and the detection rating for each item (see the special column in Fig. 1):

RPN = Severity × Occurrence × Detection    (1)

This number for a single item can be between 1 and 1000.

Then, the total RPN can be calculated by summing up the risk priority numbers for all failure modes (Fig. 1, at the bottom of the table). This number alone is meaningless, because each FMEA has a different number of failure modes and effects. However, it can serve for comparison with the revised total RPN once the improving measures have been proposed (see further).
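The calculation of this step can be sketched in a few lines. The function names are hypothetical, and the two worksheet items are invented ratings used only to show the summation.

```python
def rpn(severity: float, occurrence: float, detection: float) -> float:
    """Risk priority number of one item: the product of its three ratings."""
    return severity * occurrence * detection


def total_rpn(items: list[tuple[float, float, float]]) -> float:
    """Sum of the risk priority numbers over all items of one FMEA."""
    return sum(rpn(s, o, d) for s, o, d in items)


# With 10-point scales, the extremes of a single item are 1 and 1000.
lowest = rpn(1, 1, 1)
highest = rpn(10, 10, 10)

# Hypothetical (severity, occurrence, detection) ratings of two items.
worksheet = [(7, 4, 3), (5, 2, 6)]
total = total_rpn(worksheet)  # 84 + 60

print(lowest, highest, total)
```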

7. Prioritizing the failure modes for action

The failure modes can now be ranked from the highest RPN to the lowest RPN. This can easily be accomplished by common spreadsheet programs (e.g. Excel).

The team must now decide which failure modes will be worked on to reduce their RPN. Usually, a limit value of RPN is chosen, and only the items whose RPN exceeds this limit are dealt with. However, special attention must also be paid to all cases with the highest severity ratings (e.g. 8–10), regardless of their RPN.
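The ranking and selection described in this step can also be done in a short script instead of a spreadsheet. The following is a minimal sketch: the item labels, the ratings, the RPN limit of 100, and the severity limit of 8 are all assumptions chosen for illustration.

```python
def select_for_action(items, rpn_limit, severity_limit=8):
    """Rank items by RPN (highest first) and pick those needing action.

    An item is selected if its RPN exceeds rpn_limit or if its severity
    rating reaches severity_limit, even when its RPN is low.
    """
    def item_rpn(item):
        return item["S"] * item["O"] * item["D"]

    ranked = sorted(items, key=item_rpn, reverse=True)
    return [
        item["mode"]
        for item in ranked
        if item_rpn(item) > rpn_limit or item["S"] >= severity_limit
    ]


# Hypothetical ratings: S = severity, O = occurrence, D = detection.
items = [
    {"mode": "A", "S": 9, "O": 1, "D": 2},  # RPN 18, but very severe
    {"mode": "B", "S": 5, "O": 6, "D": 4},  # RPN 120
    {"mode": "C", "S": 3, "O": 3, "D": 3},  # RPN 27
]

selected = select_for_action(items, rpn_limit=100)
print(selected)
```

Item A is selected despite its low RPN, which reflects the rule that the most severe cases must not be filtered out by the RPN limit alone.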

8. Taking action for eliminating or reducing the high-risk failure modes

Each of the high-risk failure modes is discussed, and the team members propose measures to reduce its RPN. This number is a product of three terms (severity, occurrence, and detectability), and reducing any of them will reduce the RPN. However, the best way is to eliminate the cause of a particular failure. For example, if a steel component can fail due to corrosion, the use of stainless steel can fully avoid this danger. If there is no failure, there is no need to reduce its severity or frequency, nor to improve its detectability.

Then, measures follow for reducing the severity of failures or their frequency. (Some failure modes have similar causes.) Improvement can be achieved by a new design, by using other components or materials, or by improving the incoming inspection of components and raw materials and discarding the unsuitable ones. The third way to reduce the RPN aims at improving the detection of failures at early stages (e.g. by building in special elements or sensors or by periodic inspections). However, this does not actually improve the structure itself.

9. Calculation of the resulting RPN as the failure modes are reduced

For each corrected item, new ratings (severity, occurrence, and detectability) are determined, as well as the new risk priority number (see part 2 of Fig. 1). Then, the total RPN is calculated for the whole structure. This number can often be several tens of percent lower than the original total RPN, partly thanks to the elimination of the causes of some failures. Comparing the two total RPNs shows how effective the FMEA was. It can also help in deciding which measures to take when several ways of improvement, with different RPNs, are possible.
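The comparison of the original and revised total RPN amounts to a simple percentage. The sketch below uses invented ratings, and the function name is hypothetical. A failure mode whose cause has been eliminated simply does not appear in the revised list.

```python
def rpn_reduction(before, after):
    """Return (total before, total after, reduction in percent).

    Each argument is a list of (severity, occurrence, detection) tuples.
    """
    total_before = sum(s * o * d for s, o, d in before)
    total_after = sum(s * o * d for s, o, d in after)
    percent = 100 * (total_before - total_after) / total_before
    return total_before, total_after, percent


# Hypothetical ratings before and after the proposed measures.
before = [(8, 6, 2), (6, 3, 4)]  # RPN 96 and 72
after = [(8, 3, 2), (6, 3, 2)]   # RPN 48 and 36

result = rpn_reduction(before, after)
print(result)
```

Running the same function for each alternative set of measures gives one figure per alternative, which can support the choice between them.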

10. Taking action for improvements

The recommended measures for improvement are written into the FMEA form, including their ratings and RPN. However, the most important thing is to ensure that these measures are actually carried out. Thus, it must also be specified who will be responsible for each corrective action, the date by which it should be carried out, and the person who will check it (with respect to the competences of the FMEA team). The final FMEA forms are then submitted to the management.

Concluding remarks

Failure Mode and Effect Analysis, although it is very simple and does not work explicitly with probabilities, can significantly reduce the number of mistakes happening during the design, manufacture, and assembly or building of an object, as well as the number of failures occurring during its life. Thus, FMEA reduces the total costs and increases the safety, reliability, lifetime, and quality of the object. Very often, the design is improved.

Further details on FMEA can be found in the literature, e.g. [1–3]. FMEA has been incorporated into reliability standards, such as IEC 60812, and commercial computer programs for FMEA are also available, although creating one's own purpose-tailored program is easy.

A variant of FMEA exists, called FMECA (failure mode, effects, and criticality analysis), which puts more emphasis on the assessment of consequences of possible failures [3]. The principle, however, is the same as above.

Example 1

In a Failure Modes and Effects Analysis, done during the design of a home appliance, five possible failure modes were revealed. Their severity (S), probability of occurrence (O), and possibility of early detection (D) were classified as shown in the table below. Calculate the RPN for each failure mode and the resultant RPN for the whole appliance.

Solution

The individual values of RPN (= S × O × D) are given in the last column of the table, and the resultant value (= ∑RPNi) in its last row.

Failure mode no.   Severity   Occurrence   Detectability   RPN
1                  8          6            2               96
2                  4.5        7            2               63
3                  6          3            4               72
4                  2.5        4            7               70
5                  5          6            3               90
Total RPN                                                  391
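The table can be checked with a few lines of Python. The ratings are copied from the table above; the variable names are only illustrative.

```python
# (failure mode no., severity, occurrence, detectability), copied from the table.
table = [
    (1, 8, 6, 2),
    (2, 4.5, 7, 2),
    (3, 6, 3, 4),
    (4, 2.5, 4, 7),
    (5, 5, 6, 3),
]

# RPN of each failure mode and the resultant RPN of the whole appliance.
rpn_values = {no: s * o * d for no, s, o, d in table}
total = sum(rpn_values.values())

for no, value in rpn_values.items():
    print(f"Failure mode {no}: RPN = {value:g}")
print(f"Total RPN = {total:g}")
```

The script prints the RPN values 96, 63, 72, 70, and 90 and the total 391, in agreement with the table.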

References

  1. McDermott R E, Mikulak R J, Beauregard M R. The Basics of FMEA. Portland: Resource Engineering; 1996. 90 p.
  2. Menčík J. Failure mode and effect analysis - a tool for increasing the reliability and quality of constructions. In: Proc. Int. Conf. “Quality and Reliability in Building Industry”, Levoča, 24–26 October 2001. Košice: Technical University of Košice; p. 346–351.
  3. O'Connor P D T. Practical Reliability Engineering. 4th ed. Chichester: John Wiley & Sons; 2002. 513 p.
