Open access peer-reviewed chapter

Updated Operational Reliability from Degradation Indicators and Adaptive Maintenance Strategy

Written By

Christophe Letot, Lucas Equeter, Clément Dutoit and Pierre Dehombreux

Submitted: 08 December 2016 Reviewed: 18 April 2017 Published: 20 December 2017

DOI: 10.5772/intechopen.69281

From the Edited Volume

System Reliability

Edited by Constantin Volosencu

Abstract

This chapter is dedicated to the reliability and maintenance of assets that are characterized by a degradation process. The item state is related to a degradation mechanism that represents the unit-to-unit variability and time-varying dynamics of systems. The maintenance scheduling has to be updated considering the degradation history of each item. The research method relies on the updating process of the reliability of a specific asset. Given a degradation process and costs for preventive/corrective maintenance actions, an optimal inspection time is obtained. At this time, the degradation level is measured and a prediction of the degradation is conducted to obtain the next inspection time. A decision criterion is established to decide whether the maintenance action should take place at the current time or be postponed. Consequently, there is an optimal number of inspections that allows the useful life of an asset to be extended before performing the preventive maintenance action. A numerical case study involving a non-stationary Wiener-based degradation process is proposed as an illustration of the methodology. The results showed that the expected cost per unit of time considering the adaptive maintenance strategy is lower than the expected cost per unit of time obtained for other maintenance policies.

Keywords

  • degradation-based reliability
  • degradation models
  • remaining useful life
  • reliability-based maintenance
  • predictive maintenance
  • numerical case study

1. Introduction

Maintenance is a keystone to ensure the competitiveness of any industry in terms of productivity, quality and availability. According to MIL-STD-3034, standard maintenance (preventive, corrective and inactive) is the action of performing tasks (time-directed, condition-directed, failure-finding, servicing and lubrication) at periodicities (periodic, situational and unscheduled) to ensure the item’s functions (active, passive, evident and hidden) are available until the next scheduled maintenance period. Both preventive maintenance and corrective maintenance tasks are performed on industrial equipment throughout its operational lifetime. The balance between preventive and corrective maintenance actions is usually ruled by the long-term cost rate, the asset availability or safety criteria. Accordingly, different maintenance strategies are encountered in the literature [1]. They concern the replacement of systems that are subject to random failures and whose states are known at all times.

However, in some particular cases, the item state may be influenced by some factors especially for mechanical units that have to cope with variable mechanical stresses, a variable energy consumption, a modification of the operating conditions and the effect of the environment. Obviously, the reliability and remaining useful life (RUL) of such equipment will change accordingly. Consequently, the maintenance scheduling has to be updated considering the degradation history. This topic is covered by the degradation-based reliability approach that consists in monitoring degradation covariates with respect to a given threshold in order to trigger inspection or maintenance actions.

Several case studies have highlighted that the failure of an item is usually related to a degradation process. Typical examples of such degradation processes are crack growth in a mechanical part due to fatigue loading, the wear of cutting tools in machining, the development of corrosion in reinforced concrete structures and the development of pitting on bearing races. Moreover, a large number of experiments and engineering phenomena show that items of the same category, even from one identical batch, degrade differently from one another in performance. As the failure of an item can lead to dramatic consequences, it is mandatory to assess the specific remaining useful life (RUL) accurately and to schedule the maintenance tasks accordingly for each item. The modelling of the degradation mechanism based on measurements and fitting procedures is a key element to achieve this objective.

Historically, the degradation was first considered at the design stage of an item, using empirical laws for the design of mechanical parts subjected to fatigue loading cycles (e.g., Palmgren fatigue life for bearings and Paris-Erdogan crack growth relationship). However, experience showed that these empirical degradation models were affected by a significant dispersion on the predicted life, which made it necessary to consider uncertainties in such models. Consequently, deterministic models were replaced by stochastic models to take into account the unit-to-unit variability and time-varying dynamics for the remaining useful life prediction. Thanks to the development of accurate real-time sensors and dedicated monitoring software, the degradation can be tracked by measuring related physical variables such as vibrations, temperatures, pressures and forces. Monitoring those indicators makes it possible to detect faulty behaviours and to forecast a degradation trend, and thus to obtain a better remaining useful life prediction. To sum up, the reliability and the remaining useful life can be assessed at three different stages of an item life as illustrated in Figure 1:

  1. The design stage, during which the physical degradation mechanism is modelled taking into account the uncertainties in the parameters. This gives a nominal life expectancy of the item that depends on given conditions of usage.

  2. The in-service stage, during which the degradation indicators are monitored and alarm thresholds are set. Faulty behaviours due to a process perturbation or an external cause can be detected.

  3. The end-of-life stage, from which failure data are used to update both the degradation models and the threshold values for the monitored indicators.

Figure 1.

The three complementary approaches for the reliability and remaining useful life estimation.

2. Degradation-based reliability

2.1. Reliability

The reliability of an item (a part, a machine or a system) is the probability that the item will perform its intended function throughout a specified time interval when operated in a normal (or stated) environment [2]. According to the standards, the term ‘reliability’ also refers to a reliability value and is considered as the probability for an item to be in a functional state. Given a random variable Tf that represents the lifetime of an item, the reliability R(t) is given by the following equation:

$$R(t) = P(T_f > t) \tag{1}$$

As previously mentioned, the reliability can be identified at different stages of an item life. The fitting step of reliability is usually performed using field failure data or simulated failure times from the design stage. The set of failure data is used to obtain the non-parametric failure function (also called unreliability) F(t) that represents the distribution of the failure times:

$$F(t) = P(T_f \le t) = 1 - R(t) \tag{2}$$

The probability density function f(t) is derived from the failure function:

$$f(t) = \frac{dF(t)}{dt} \tag{3}$$

Finally, the failure rate (or hazard function) h(t) is defined:

$$h(t) = \frac{f(t)}{R(t)} \tag{4}$$

The failure rate represents the conditional probability of failure of an item during [t, t + Δt] given that this item has survived until time t. The failure rate is a first indicator on the evolution of an item state. An increasing failure rate indicates that the conditional probability of failure over time increases, thus implying a progressive degradation process.
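As a minimal numerical sketch of Eqs. (1)–(4), the failure rate can be approximated from any reliability function by finite differences. The function name `failure_rate`, the step size `dt` and the exponential example are illustrative assumptions and are not taken from the chapter.

```python
import math

def failure_rate(reliability, t, dt=1e-4):
    """Approximate h(t) = f(t) / R(t), with f(t) = dF/dt = -dR/dt (Eqs. (3)-(4))."""
    # Central finite difference of F(t) = 1 - R(t); dt is an assumed step size
    density = (reliability(t - dt) - reliability(t + dt)) / (2.0 * dt)
    return density / reliability(t)

# Hypothetical example: exponential lifetimes with a mean of 100 time units
exponential = lambda t: math.exp(-t / 100.0)
rates = [failure_rate(exponential, t) for t in (10.0, 50.0, 200.0)]
```

For the exponential law the computed rate stays constant over time, whereas a reliability function with an increasing rate would reflect the progressive degradation discussed above.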

Fitting a parametric reliability model to data is usually achieved with one of two methods: the regression method or the maximum likelihood method. In the regression method, the parametric reliability law is transformed into a linear form and a regression fit is performed. The maximum likelihood method relies on the likelihood function of the reliability model to identify the parameters that maximize the probability of observing the failure data.

The fitting procedure is illustrated on the two-parameter Weibull law for complete data as an example.

2.1.1. The regression method

The failure function of a Weibull law is $F(t) = 1 - \exp\left(-\left(\frac{t}{\eta}\right)^{\beta}\right)$, η being the scale parameter and β the shape parameter. The linear form y = Ax + B of this model for the regression fit is:

$$\ln\left(\ln\left(\frac{1}{1 - \hat{F}(t_i)}\right)\right) = \hat{\beta}\,\ln(t_i) - \hat{\beta}\,\ln(\hat{\eta}) \tag{5}$$

With $\hat{F}(t_i)$ a non-parametric estimator of the failure function assessed at the ith failure time considering that n items were operational at the beginning of the survival study. Common non-parametric estimators of the failure function are [3]:

  1. the Kaplan-Meier estimator $\hat{F}(t_i) = i/n$;

  2. the mean rank estimator $\hat{F}(t_i) = i/(n + 1)$;

  3. the approximate median rank estimator $\hat{F}(t_i) = (i - 0.3)/(n + 0.4)$.

From Eq. (5), the identification of the parameters for the linear regression gives:

$$\hat{\beta} = A \tag{6}$$
$$\hat{\eta} = \exp\left(-\frac{B}{\hat{\beta}}\right) \tag{7}$$
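The regression fit of Eqs. (5)–(7) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `fit_weibull_regression` is hypothetical, the data are complete (no censoring), the estimator $(i - 0.3)/(n + 0.4)$ is used for $\hat{F}(t_i)$, and the regression is an ordinary least squares fit.

```python
import math

def fit_weibull_regression(failure_times):
    """Fit a two-parameter Weibull law by linear regression on Eq. (5)."""
    t = sorted(failure_times)
    n = len(t)
    # x = ln(t_i); y = ln(ln(1 / (1 - F_i))) with F_i = (i - 0.3) / (n + 0.4)
    x = [math.log(ti) for ti in t]
    y = [math.log(math.log(1.0 / (1.0 - (i - 0.3) / (n + 0.4))))
         for i in range(1, n + 1)]
    # Ordinary least squares for the line y = A x + B
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    A = sxy / sxx
    B = y_mean - A * x_mean
    beta = A                   # Eq. (6)
    eta = math.exp(-B / beta)  # Eq. (7)
    return beta, eta
```

With field data, the quality of the fit would typically be judged from the coefficient of determination of the regression line.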

2.1.2. The maximum likelihood method

The likelihood function L considers the product of the probability density function of the model governed by a set of parameters θ, each function being assessed at a failure time ti:

$$L(t_i \mid \theta) = \prod_{i=1}^{n} f(t_i) \tag{8}$$

For the Weibull case, using the log-transformation of the likelihood function, it follows:

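$$\ln L(t_i \mid \beta, \eta) = \sum_{i=1}^{n} \left\{ \ln(\hat{\beta}) - \ln(\hat{\eta}) + (\hat{\beta} - 1)\left(\ln(t_i) - \ln(\hat{\eta})\right) - \left(\frac{t_i}{\hat{\eta}}\right)^{\hat{\beta}} \right\} \tag{9}$$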

Taking the partial derivatives of the log-likelihood function, the estimators of the parameters are [3]:

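$$\frac{\sum_{i=1}^{n} t_i^{\hat{\beta}} \ln(t_i)}{\sum_{i=1}^{n} t_i^{\hat{\beta}}} - \frac{1}{\hat{\beta}} - \frac{1}{n} \sum_{i=1}^{n} \ln t_i = 0 \tag{10}$$
$$\hat{\eta} = \left(\frac{1}{n} \sum_{i=1}^{n} t_i^{\hat{\beta}}\right)^{\frac{1}{\hat{\beta}}} \tag{11}$$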
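Equation (10) has no closed-form solution for the shape parameter, so it is solved numerically. The sketch below uses a simple bisection, which is one possible choice among others; the function name `fit_weibull_mle` and the search bracket [0.01, 50] are assumptions made for illustration, and the bracket is supposed to contain the root.

```python
import math

def fit_weibull_mle(failure_times, lo=0.01, hi=50.0, tol=1e-10):
    """Solve Eq. (10) for beta by bisection, then obtain eta from Eq. (11)."""
    t = list(failure_times)
    n = len(t)
    logs = [math.log(ti) for ti in t]
    mean_log = sum(logs) / n
    t_max = max(t)

    def score(beta):
        # Left-hand side of Eq. (10); times are scaled by t_max to avoid
        # overflow, which leaves the ratio of the two sums unchanged
        w = [(ti / t_max) ** beta for ti in t]
        weighted = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted - 1.0 / beta - mean_log

    # The score increases with beta: negative near zero, positive for large beta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    eta = t_max * (sum((ti / t_max) ** beta for ti in t) / n) ** (1.0 / beta)
    return beta, eta
```

A gradient-based optimizer applied directly to Eq. (9) would be an equally valid design; bisection is used here only because it needs no external library.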

Sometimes, the produced fit does not match the experimental data. In this case, a third parameter has to be introduced, that is, the location parameter γ that shifts the failure times accordingly. A convenient approach then consists in testing different values of this location parameter, applying the regression for each value and retaining the estimator $\hat{\gamma}$ for which the highest coefficient of determination is obtained. The maximum likelihood method can also handle the case of the three-parameter Weibull estimation through numerical optimization of the likelihood function.

2.2. First hitting time process

First hitting (or passage) time processes are used in a wide area of applications including medicine, environmental sciences, engineering sciences, economy and sociology. Accordingly, such processes can describe either the sojourn duration of a patient in a hospital given the gravity of his illness, the time delay before a polluting product reaches an area, the lifetime of mechanical parts given a stochastic damage assessment, etc. Generally speaking, these processes aim at capturing the stochastic behaviour of a given diffusive mechanism to predict the hitting times of a critical threshold.

A first hitting time process has two components:

  • A stochastic (degradation) process denoted $\{Z(t),\ t \in T,\ z \in Z\}$, which describes the random evolution of a degradation process (e.g., physics-related processes in the areas of mechanics, chemistry and electricity or non-physics-related processes such as the evolution of quality or a performance indicator) with respect to elapsed time;

  • A given state space boundary value denoted zf that defines the failure level of the degradation process.

Given the initial degradation value z0 at the starting time t0, the first hitting time Tf of reaching the critical threshold is [4]:

$$T_f = \inf\left(t \mid Z(t) - Z(t_0) \ge z_f - z_0\right) \tag{12}$$

Consequently, the first passage time for exceeding the degradation threshold is the first time t for which the stochastic process Z(t) has reached the threshold zf given that it started from the value z0 at initial time t0. Instead of considering the first hitting time, one may be interested in obtaining the probability of crossing the failure threshold:

$$F(t \mid Z(t), z_f, z_0, t_0) = P(T_f \le t) = P\left(Z(t) - Z(t_0) \ge z_f - z_0\right) \tag{13}$$

The failure function F(t) is now conditioned by the degradation process Z(t) assessed over time, the failure threshold zf, and the initial values t0 and z0.
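For a degradation path recorded at discrete instants, Eqs. (12) and (13) can be sketched as follows. The helper names `first_hitting_time` and `empirical_failure_function` are hypothetical, and the sketch assumes that a crossing is only detected at the recorded times, so the true hitting time may lie slightly earlier.

```python
def first_hitting_time(times, values, z_f):
    """Return the first recorded time at which the path reaches z_f (Eq. (12)), or None."""
    z0 = values[0]
    for t, z in zip(times, values):
        # Same condition as Eq. (12): Z(t) - Z(t0) >= z_f - z0
        if z - z0 >= z_f - z0:
            return t
    return None

def empirical_failure_function(hitting_times, t):
    """Fraction of paths whose hitting time is <= t (empirical counterpart of Eq. (13))."""
    hits = [tf for tf in hitting_times if tf is not None and tf <= t]
    return len(hits) / len(hitting_times)

# Hypothetical non-monotonic path crossing the threshold z_f = 1.0 at t = 2
t_hit = first_hitting_time([0, 1, 2, 3], [0.0, 0.4, 1.2, 0.9], 1.0)
```

Paths that never reach the threshold within the observation window return `None` and are counted as survivors in the empirical failure function.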

2.3. Remaining useful life

Let Z(t) be the evolution of the degradation over time, zf (a positive value) be the failure threshold and X(tj) a degradation measurement at inspection time tj. It is supposed that the degradation process leads to a soft failure (at time Tf), which means that there are no other hard failure modes which compete for the failure time. Considering the first hitting time process of a given threshold, the RUL of an item given the conditional measurement X(tj) at inspection time tj and the preset threshold zf is:

$$\mathrm{RUL}(t_j) = \inf\left\{ l : X(t_j + l) \ge z_f \mid l \ge 0,\ X(t_j) < z_f \right\} \tag{14}$$

In order to obtain an accurate estimation of the RUL, the degradation model Z(t) should perfectly fit the degradation data X(t) to minimize the error in the forecasted degradation value Z(tj + l). Practically, the RUL is obtained by the computation of the mean residual life MRL (i.e., the mean value of the RUL conditioned by the observations X(t)) using the following equation [5]:

$$\mathrm{MRL}(t_j) = E\left(T_f - t \mid T_f > t,\ X(t_j) < z_f\right) = \int_{t_j}^{\infty} R(u \mid X(t_j))\,du \tag{15}$$

With R(u | X(tj)) the conditional degradation-based reliability of the item at time u > tj, given the degradation measurement X(tj).
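The integral of Eq. (15) rarely has a closed form, so a numerical quadrature is a natural way to evaluate it. The sketch below applies the trapezoidal rule to any conditional reliability supplied as a function; the name `mean_residual_life`, the truncation `horizon` and the number of steps are assumptions for illustration.

```python
import math

def mean_residual_life(conditional_reliability, t_j, horizon, steps=20000):
    """Trapezoidal approximation of Eq. (15), truncated at t_j + horizon."""
    h = horizon / steps
    total = 0.5 * (conditional_reliability(t_j)
                   + conditional_reliability(t_j + horizon))
    total += sum(conditional_reliability(t_j + k * h) for k in range(1, steps))
    return h * total

# Hypothetical check case: a memoryless (exponential) item inspected at t_j = 30,
# for which the mean residual life equals the mean life of 100 time units
cond_rel = lambda u: math.exp(-(u - 30.0) / 100.0)
mrl = mean_residual_life(cond_rel, t_j=30.0, horizon=3000.0)
```

The horizon has to be chosen large enough for the conditional reliability to be negligible at the upper bound.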

2.4. Degradation models

According to Gorjian et al. [6], degradation models can be divided into two main families: normal degradation models and accelerated degradation models.

  • Normal degradation models are dedicated to the estimation of reliability for assets operating at normal conditions. Examples of normal degradation models are the general degradation path model, the random process model, the (non-)linear regression models and the time series model. Normal degradation models can also consider some stress factors; such cases are the stress-strength interference model and the cumulative damage/shock model, for which the degradation measure is a function of a defined stress.

  • Accelerated degradation models make inference about the reliability at normal conditions given degradation data that were obtained at accelerated time/stress conditions. There exist two categories: physics-based models and statistics-based models. In physics-based models, the physical variables of the model (e.g., pressure, temperature and stress) are increased in order to obtain failure data within a reasonable timeframe. Examples of physics-based models are the Arrhenius model for temperature-related degradation mechanisms and the inverse power model for non-thermal degradation mechanisms (e.g., the fatigue damage in bearings). Statistics-based models use data obtained in various operating conditions to establish a statistical model from a set of input explanatory variables. An example of a statistics-based model is the Cox proportional hazards model, which expresses the failure rate as the product of a baseline failure rate and a function of the covariates [7].

As previously mentioned, knowledge of the RUL is a keystone for guiding optimal maintenance planning. It has been considered as a fundamental ingredient in the field of Prognostics and Health Management (PHM) [8]. The main challenge of RUL estimation lies in the presence of heterogeneity due to different inner states or external operating conditions of systems. The performance or degradation of a system is caused by interactions of both the inner deterioration and the working environment of the system, which justifies the need to take the heterogeneity into account in the degradation model. RUL estimation is thus affected by three kinds of heterogeneity: the unit-to-unit variability for items from the same batch, the variability in the operating conditions over time and the diversity of tasks and workloads of systems during their life cycles. Each kind of heterogeneity calls for suitable degradation models. This study focuses on data-driven models with unit-to-unit variability and time-varying dynamics of systems.

2.4.1. Random coefficient regression models

Random effects were first considered in random coefficient regression models [9]. At each inspection time tj, a degradation value Xi(tj) is measured on an item i. The degradation model takes the form:

$$X_i(t_j) = Z(t_{ij}; \alpha; \beta_i) + \varepsilon_{ij} \tag{16}$$

With α = (α1, α2, …, αn) a vector of constant parameters that are characteristics of the tested population and βi = (βi1, βi2, …, βin) a vector of random parameters that are specific to each item i (i.e., α is the vector representing the common part of the degradation, while βi represents the heterogeneity). The term εij represents the measurement error on the degradation value at time tj on the item i and is supposed to follow a Gaussian distribution with a null mean and a standard deviation σε. Common representations of this model are the linear form, the power form and the logarithmic form. However, this simple model has several drawbacks, including the need for more historical degradation data from different items of the same category, the difficulty in capturing the time-varying dynamics of the items and the assumed independence of the random noise over time [10].

2.4.2. Stochastic process-based models

Stochastic process-based models with random coefficients are able to consider both time-varying dynamics and unit-to-unit variability. These processes may be represented by some specific models that are derived from the Lévy process family [11]. A Lévy stochastic process has independent (non-)stationary increments which represent the sequence of successive random and independent displacements of a point in a space. Frequently used models from this family are the gamma process [12] and the Wiener process [13]. According to the results presented in the literature, it seems that stochastic process-based models with random effects can effectively improve the accuracy of RUL estimation while extending the range of applications to both monotonic and non-monotonic degradation processes, whether they are linear or non-linear [8]. However, in industrial applications, the main drawback of stochastic process-based models with random effects is the computational burden, which can be high and strongly dependent on the choice of random parameters and their distribution. Generally, the assumption of normally distributed parameters is chosen [14]. The next section is dedicated to the study of a non-stationary formulation of the Wiener process that is used in the illustrative example at the end of this chapter.

2.5. The Wiener process

The Wiener process has been widely applied to degradation modelling in various fields, for example, bearings, laser generators and milling machines [15]. The Wiener process is a particularly good candidate to represent the evolution of a degradation process that is made of an increasing trend over time with random Gaussian noise, both being proportional to elapsed time. It is characterized by continuous sample paths and independent, (non-)stationary and normally distributed increments [16].

2.5.1. Definition and mathematical properties

A Wiener process-based model has two kinds of parameters: one set related to the expected value of the degradation rate and one that represents the magnitude of the random noise. The generic formulation of a degradation process ZW(t) ruled by a Wiener process-based model is:

$$Z_W(t) = Z_W(t_0) + m(t; \theta) + \sigma W(t) \tag{17}$$

With $Z_W(t_0)$ the initial degradation value at time t0, m(t; θ) the trend function ruled by the set of parameters θ, σ a parameter that represents the magnitude of the Gaussian noise perturbing the trend and W(t) the standard Brownian motion that has the following characteristics:

  • W(0) = 0;

  • W has independent increments, that is, for $0 \le t_1 \le t_2 \le t_3 \le t_4$, the increments $W(t_4) - W(t_3)$ and $W(t_2) - W(t_1)$ are independent random variables;

  • W is a continuous stochastic process, and for $0 \le t_1 \le t_2$, the increment $W(t_2) - W(t_1)$ has a normal distribution with mean equal to zero and standard deviation equal to $\sqrt{t_2 - t_1}$.

It follows that the Wiener process-based model can also be formulated as:

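$$Z_W(t) = Z_W(t_0) + N\left(m(t; \theta),\ \sigma\sqrt{t}\right) \tag{18}$$

With $N\left(m(t; \theta),\ \sigma\sqrt{t}\right)$ the normal distribution with probability density function $f_W(x)$:

$$f_W(x) = \frac{1}{\sigma\sqrt{2\pi t}} \exp\left(-\frac{\left(x - m(t; \theta)\right)^2}{2\sigma^2 t}\right) \tag{19}$$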

Therefore, the mathematical expectation and variance of a Wiener process-based degradation model are:

$$E\left(Z_W(t)\right) = Z_W(t_0) + m(t; \theta) \tag{20}$$
$$V\left(Z_W(t)\right) = \sigma^2 t \tag{21}$$
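A sample path of the model in Eq. (17) can be generated by accumulating independent Gaussian increments, which also gives a quick empirical check of Eqs. (20) and (21). The function name `simulate_wiener_path`, the linear trend and the parameter values below are illustrative assumptions; the trend is taken relative to its value at the first time so that the path starts at `z0`.

```python
import math
import random

def simulate_wiener_path(times, trend, sigma, z0=0.0, rng=random):
    """Sample Z_W at increasing times according to Eq. (17)."""
    z = [z0]
    w = 0.0  # standard Brownian motion, W = 0 at the first time
    for t_prev, t_next in zip(times[:-1], times[1:]):
        # Independent increment with standard deviation sqrt(t_next - t_prev)
        w += rng.gauss(0.0, math.sqrt(t_next - t_prev))
        z.append(z0 + trend(t_next) - trend(times[0]) + sigma * w)
    return z

# Hypothetical example: linear trend m(t) = 0.5 t, sigma = 1, seeded generator
rng = random.Random(0)
path = simulate_wiener_path([0.0, 2.0, 4.0, 6.0, 8.0, 10.0],
                            lambda t: 0.5 * t, 1.0, rng=rng)
```

Repeating the simulation many times and averaging the end values should reproduce, up to sampling error, the expectation and variance given above.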

2.5.2. Fitting the Wiener process

Given a set of n + 1 measurements of degradation data z0, z1, z2, …, zn at inspection times t0, t1, t2, …, tn, the fitting procedure of a Wiener process-based degradation model is achieved mainly using the maximum likelihood method [17]. This method yields the values of the parameters θ and σ from the probability density function (pdf) of the Wiener process-based model, each pdf being assessed at the measurement points. The likelihood function is:

$$L(t, z \mid \theta, \sigma) = \prod_{i=0}^{n-1} \frac{1}{\sigma\sqrt{2\pi (t_{i+1} - t_i)}} \exp\left(-\frac{\left[(z_{i+1} - z_i) - \left(m(t_{i+1}; \theta) - m(t_i; \theta)\right)\right]^2}{2\sigma^2 (t_{i+1} - t_i)}\right) \tag{22}$$

For the stationary Wiener process (i.e., $m(t; \theta) = \mu t$ is a linear function of time), the estimates of the parameters μ and σ are obtained by setting the partial derivatives of the log-likelihood function to zero [17]:

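$$\hat{\mu} = \frac{\sum_{i=0}^{n-1} (z_{i+1} - z_i)}{\sum_{i=0}^{n-1} (t_{i+1} - t_i)} = \frac{z_n - z_0}{t_n - t_0} \tag{23}$$
$$\hat{\sigma} = \sqrt{\frac{1}{n-1} \sum_{i=0}^{n-1} \frac{\left[(z_{i+1} - z_i) - \hat{\mu}\,(t_{i+1} - t_i)\right]^2}{\Delta t_i}}, \quad \Delta t_i = t_{i+1} - t_i \tag{24}$$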

For non-stationary Wiener processes, the parameters are obtained using numerical optimization techniques such as quasi-Newton methods [18].
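For the stationary case, the closed-form estimators can be sketched directly from the measured increments. The function name `fit_stationary_wiener` is hypothetical, and the sketch assumes the 1/(n − 1) normalization of the variance over the n increments, which requires at least three measurements.

```python
import math

def fit_stationary_wiener(times, values):
    """Estimate the drift and the noise magnitude of a linear-trend Wiener process."""
    n = len(times) - 1  # number of increments (n + 1 measurements)
    # Drift: total degradation increase divided by total elapsed time
    mu = (values[-1] - values[0]) / (times[-1] - times[0])
    # Sum of squared increment residuals, each scaled by its time step
    ss = sum(((values[i + 1] - values[i]) - mu * (times[i + 1] - times[i])) ** 2
             / (times[i + 1] - times[i]) for i in range(n))
    sigma = math.sqrt(ss / (n - 1))
    return mu, sigma

# Hypothetical measurements taken at unit inspection intervals
mu_hat, sigma_hat = fit_stationary_wiener([0.0, 1.0, 2.0, 3.0],
                                          [0.0, 1.0, 1.0, 3.0])
```

Because the estimators only involve increments, they can be recomputed cheaply each time a new measurement becomes available.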

2.5.3. FHT and RUL distribution of a Wiener process

Given a degradation threshold value $z_c$ and an initial degradation value $z_0$, the hitting times of a Wiener process-based degradation model follow an inverse Gaussian law with mean parameter equal to $m^{-1}(z_c - z_0 \mid \theta)$ (i.e., the inverse function of $m(t \mid \theta)$ evaluated at $z_c - z_0$) and shape parameter equal to $\frac{(z_c - z_0)^2}{\sigma^2}$, which has the following probability density function $f_{IG}$:

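$$f_{IG}(t \mid z_c, z_0, \theta, \sigma) = \frac{z_c - z_0}{\sigma\sqrt{2\pi t^3}} \exp\left\{ -\frac{(z_c - z_0)^2}{2 t \sigma^2}\, \frac{\left[t - m^{-1}(z_c - z_0 \mid \theta)\right]^2}{\left[m^{-1}(z_c - z_0 \mid \theta)\right]^2} \right\} \tag{25}$$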

The corresponding reliability considering the last measurement zi at inspection time ti is:

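$$R(t \mid z_c, z_i, t_i, \theta, \sigma) = 1 - \int_{t_i}^{t} \frac{z_c - z_i}{\sigma\sqrt{2\pi x^3}} \exp\left\{ -\frac{(z_c - z_i)^2}{2 x \sigma^2}\, \frac{\left[x - m^{-1}(z_c - z_i \mid \theta)\right]^2}{\left[m^{-1}(z_c - z_i \mid \theta)\right]^2} \right\} dx \tag{26}$$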

As the parameters of the Wiener process-based degradation model are updated for each new measurement, the reliability function given by Eq. (26) is a dynamic reliability, that is, the reliability is updated given the updated estimation of the parameters and the last degradation measurement. Consequently, it corresponds to the RUL distribution over time that is assessed at different inspection times.
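The updated reliability can be evaluated numerically from the inverse Gaussian density. The sketch below is written under two explicit assumptions that are not stated in the chapter: the trend is linear, $m(t) = \mu t$, so that $m^{-1}(z) = z/\mu$, and the integration variable is interpreted as the time elapsed since the last inspection. The function names are hypothetical.

```python
import math

def hitting_time_pdf(t, z_c, z_i, mu, sigma):
    """Inverse Gaussian density of Eq. (25), assuming a linear trend m(t) = mu * t."""
    if t <= 0.0:
        return 0.0
    d = z_c - z_i
    mean = d / mu  # m^{-1}(z_c - z_i) for the assumed linear trend
    exponent = -(d ** 2) / (2.0 * t * sigma ** 2) * (t - mean) ** 2 / mean ** 2
    return d / (sigma * math.sqrt(2.0 * math.pi * t ** 3)) * math.exp(exponent)

def updated_reliability(elapsed, z_c, z_i, mu, sigma, steps=20000):
    """One minus the trapezoidal integral of the density over [0, elapsed]."""
    if elapsed <= 0.0:
        return 1.0
    h = elapsed / steps
    total = 0.5 * (hitting_time_pdf(0.0, z_c, z_i, mu, sigma)
                   + hitting_time_pdf(elapsed, z_c, z_i, mu, sigma))
    total += sum(hitting_time_pdf(k * h, z_c, z_i, mu, sigma)
                 for k in range(1, steps))
    return 1.0 - h * total

# Hypothetical values: threshold 10, last measurement 0, drift 0.5, sigma 1
r_at_mean = updated_reliability(20.0, z_c=10.0, z_i=0.0, mu=0.5, sigma=1.0)
```

Each new measurement $z_i$ and each refreshed pair of parameter estimates would simply be passed to the same function to obtain the updated curve.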

3. Maintenance model

3.1. General assumptions

  • The failure time of an item of equipment is ruled by a stochastic degradation process, that is, it corresponds to the hitting time of a degradation threshold.

  • Whenever a failure occurs, a corrective maintenance action is performed immediately, that is, the degradation process cannot cross the threshold and there is no duration in failed state to consider.

  • The degradation process itself is not altered by the maintenance actions, that is, the parameters of the degradation process remain unchanged. Maintenance actions only affect the recovery values of the degradation at inspection times.

  • Both preventive maintenance and corrective maintenance are considered, that is, the failure of an item of equipment does not lead to dramatic consequences (only higher cost values). For preventive maintenance, an optimal inspection time tj = Tp is obtained considering the balance between preventive and corrective costs.

  • For the predictive maintenance approach at each inspection time, two strategies are considered depending on the measured degradation level. If the degradation is lower than expected, then the maintenance action is postponed to another inspection time considering the predicted degradation distribution. Otherwise, the maintenance action is conducted at current inspection time. The decision to do or postpone the maintenance action is given by a cost criterion.

  • In case of a corrective maintenance or a preventive replacement, the item is replaced by a new one (AGAN maintenance) so that z0 = 0, that is, the expected life of the item corresponds to its mean time to failure (MTTF).

  • Failure of the asset is tolerated, that is, it does not lead to catastrophic consequences.

  • Downtimes due to unavailability of the maintenance staff or resources are not considered at this stage, that is, the duration of maintenance actions is considered as negligible compared to the life duration of the asset (i.e., MTTR ≪ MUT).

3.2. Maintenance policies

Four types of maintenance policies are considered: corrective maintenance, preventive systematic maintenance, preventive condition-based maintenance (CBM) and predictive maintenance [19].

3.2.1. The corrective maintenance

The maintenance task is carried out after failure of the asset to identify, isolate and rectify a fault in order to restore the failed equipment, machine or system to an operational condition. The timing for corrective maintenance can be immediate (the restoration process starts immediately after a failure) or deferred (the maintenance tasks are delayed according to a set of maintenance rules). A corrective maintenance policy is mainly used for low-value assets, equipment for which failures do not lead to catastrophic consequences or items for which the RUL is hard to predict due to random failures.

3.2.2. The preventive systematic maintenance

Also known as calendar-based, clock-based or time-based maintenance, it is a maintenance action performed on an asset according to a scheduled timetable (i.e., a given periodicity between consecutive maintenance tasks). It is mainly applied to critical assets to prevent failures, or to routine inspections that occur on a regular basis to control the state of safety equipment. The optimal periodicity is obtained given the reliability of the item and the relative costs of a preventive maintenance and a corrective maintenance.

3.2.3. The preventive condition-based maintenance (CBM)

The preventive maintenance actions are based on the condition of the component being maintained. The condition of assets is tracked over time using statistical process control techniques, monitoring equipment performance through regular inspections. Measuring the variable of interest directly is usually difficult to achieve, and in this case, some other related variables are used to obtain estimates of the variable of interest (e.g., bearing wear can be assessed through vibration, noise or temperature analyses). Once the related indicators have crossed a given threshold, the preventive maintenance action is performed.

3.2.4. The predictive maintenance

Predictive maintenance is an extension of condition-based maintenance in which the state or degradation level of the asset is forecasted to predict the failure time and to adapt the maintenance tasks accordingly. An alternative denomination is adaptive maintenance, as the maintenance scheduling is continuously adapted according to the updated degradation level and its forecast.

3.3. The cost model

3.3.1. Corrective and age replacement cost model

In the context of a reliability-centred maintenance approach, the maintenance cost model is based on the reliability calculation, which yields the most relevant time to perform the preventive maintenance so as to minimize the expected maintenance cost per unit of time. A generic age replacement model is used as the preventive maintenance model [19]:

$$\bar{c}_m(T_p) = \frac{F(T_p)\,C_c + R(T_p)\,C_p}{\mathrm{MUT}|_{T_p}} = \frac{F(T_p)\,C_c + R(T_p)\,C_p}{\int_0^{T_p} R(t)\,dt} \tag{27}$$

where Tp is the time of preventive maintenance; F(Tp) is the probability of having a failure before time Tp given the degradation-based reliability model; Cc is the total corrective cost incurred when a failure occurs; Cp is the total cost due to a preventive maintenance action; $\bar{c}_m$ is the average cost per unit of time that has to be minimized; MUT|Tp is the mean up time under a preventive maintenance policy.

Considering the cost contributions, Cc and Cp are expressed as follows:

$$C_c = \mathrm{MTTR}_c \left(\tau_{sto} + \tau_{int}^{c}\right) + C_{cst}^{c} + P_{cst}^{c} \tag{28}$$
$$C_p = \mathrm{MTTR}_p \left(\tau_{sto} + \tau_{int}^{p}\right) + C_{cst}^{p} + P_{cst}^{p} \tag{29}$$

MTTRc and MTTRp are the mean times to restore, respectively, for a corrective maintenance and for a preventive maintenance, τsto the variable losses per unit of time due to unavailability of the asset, τint the variable costs per unit of time, Ccst the fixed part of the costs, Pcst the fixed part of the losses and the subscripts ‘c’ and ‘p’ standing for corrective and preventive maintenance, respectively.

From Eq. (27), considering an infinite time to perform a preventive maintenance leads to the pure corrective maintenance model, that is

$$\bar{c}_m^{\,c}(\infty) = \frac{F(\infty)\,C_c + R(\infty)\,C_p}{\int_0^{\infty} R(t)\,dt} = \frac{C_c}{\mathrm{MTTF}} \tag{30}$$

with MTTF the mean time to failure.
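The optimal preventive time of Eq. (27) can be located with a simple search over candidate times. The sketch below is only an illustration: the function names, the Weibull reliability (shape 2, scale 100), the costs Cc = 10 and Cp = 1 and the integer grid of candidates are all assumed values, and the mean up time is computed with the trapezoidal rule.

```python
import math

def cost_rate(T_p, reliability, C_c, C_p, steps=500):
    """Eq. (27): expected cost per unit of time for a preventive replacement at T_p."""
    h = T_p / steps
    # Mean up time: trapezoidal integral of R(t) over [0, T_p]
    mut = h * (0.5 * (reliability(0.0) + reliability(T_p))
               + sum(reliability(k * h) for k in range(1, steps)))
    R = reliability(T_p)
    return ((1.0 - R) * C_c + R * C_p) / mut

def optimal_replacement_time(reliability, C_c, C_p, candidates):
    """Candidate time with the lowest cost rate (plain grid search)."""
    return min(candidates, key=lambda T: cost_rate(T, reliability, C_c, C_p))

# Hypothetical asset: Weibull reliability with shape 2 and scale 100
weibull = lambda t: math.exp(-(t / 100.0) ** 2)
T_opt = optimal_replacement_time(weibull, C_c=10.0, C_p=1.0,
                                 candidates=range(1, 301))
best_cost = cost_rate(T_opt, weibull, 10.0, 1.0)
corrective_cost = 10.0 / (100.0 * math.gamma(1.5))  # Eq. (30): C_c / MTTF
```

For these assumed values the preventive policy is cheaper than the pure corrective policy, which is expected whenever the failure rate increases and the preventive cost is well below the corrective cost.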

3.3.2. Introducing the inspection cost

Considering the condition-based and the predictive maintenance models, measurements are required to assess the current degradation level and to forecast its trend. It is considered that these measurements are performed through inspections that are considered as additional costs. In this case, the total maintenance cost over time is [20]:

$$C(t) = C_i N_i(t) + C_p N_p(t) + C_c N_c(t) \tag{31}$$

With Cx and Nx(t) the cost and the counter of inspections, preventive actions and corrective tasks (x = i, p, c). On the one hand, inspections increase the total cost; on the other hand, they help to avoid failures and to extend the useful life of the equipment. Consequently, there is an optimum number of inspections to consider.

Considering the condition-based maintenance scenario, the inspections should take place at a given periodicity that will reasonably decrease the probability of crossing the degradation threshold without being too frequent.

For the predictive maintenance scenario, it is considered that at least one inspection will take place at the time corresponding to the calendar-based preventive maintenance model (i.e., the inspection will guide the decision of performing the preventive maintenance action or postponing it at another inspection time). Practically, at the first inspection time tj=1 corresponding to the time of preventive maintenance, the following criterion is assessed:

$$K_C(t_1 = T_p) = \frac{\dfrac{C_p}{t_1}}{\dfrac{F^*(t_2)\,C_c + R^*(t_2)\,C_p + C_i}{t_1 + \int_{t_1}^{t_2} R^*(t)\,dt}} \tag{32}$$

F* and R* being the updated failure and reliability functions given the last degradation measurement Z(t1) and $t_2 = T_p^* = T_p(t_1 \mid Z(t_1))$ the next forecasted inspection time given the degradation level Z(t1) measured at the first inspection time t1. This criterion is the ratio between the cost rate of the strategy that replaces the equipment at time t1 and that of the strategy that postpones the replacement to time $t_2 = T_p^*$. The numerator represents the cost rate obtained for a lifecycle t1 and the denominator the cost rate for an expected lifecycle t2 that is obtained given the last degradation measurement and considering the corrective and preventive cost values. If KC > 1, it is cheaper to postpone the preventive maintenance action to the next inspection time predicted from the degradation-based reliability model. When KC ≤ 1, the maintenance is performed at the last inspection time reached.

At each subsequent inspection, the criterion is assessed considering the total elapsed time and the number of inspections already performed. For the j-th inspection time tj, the general form of the criterion is:

$$K_C(t_j) = \frac{\dfrac{C_p + (j-1)\,C_i}{t_j}}{\dfrac{F^*(t_{j+1})\,C_c + R^*(t_{j+1})\,C_p + j\,C_i}{t_j + \int_{t_j}^{t_{j+1}} R^*(t)\,dt}} \tag{33}$$

As long as KC(tj) > 1, the maintenance action is delayed to the next inspection time tj+1.
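The criterion of Eq. (33) can be evaluated numerically once an updated reliability function R*(t) is available. The following is a minimal sketch rather than the implementation used in the chapter: the function name `cost_criterion`, the trapezoidal integration and the exponential stand-in for R*(t) are assumptions made for illustration only; the cost values are those of the example in Section 5.

```python
import math


def cost_criterion(j, t_j, t_next, rel, c_p, c_c, c_i, n_steps=2000):
    """Evaluate K_C(t_j) of Eq. (33).

    rel is the updated reliability R*(t), a callable with rel(t_j) = 1
    because the item is known to have survived until t_j.
    """
    # Numerator: cost rate if the item is preventively replaced now, at t_j.
    rate_now = (c_p + (j - 1) * c_i) / t_j

    # Trapezoidal integral of R*(t) over [t_j, t_next]:
    # the expected additional operating time if the action is postponed.
    h = (t_next - t_j) / n_steps
    integral = 0.5 * (rel(t_j) + rel(t_next))
    integral += sum(rel(t_j + k * h) for k in range(1, n_steps))
    integral *= h

    # Denominator: expected cost rate if the replacement is postponed to t_next.
    f_next = 1.0 - rel(t_next)
    cost_later = f_next * c_c + rel(t_next) * c_p + j * c_i
    rate_later = cost_later / (t_j + integral)

    return rate_now / rate_later


# Hypothetical updated reliability (illustration only, not a fitted model).
t_1, t_2 = 41.79, 60.0


def rel(t):
    return math.exp(-0.002 * (t - t_1))


k_c = cost_criterion(j=1, t_j=t_1, t_next=t_2, rel=rel,
                     c_p=500.0, c_c=2500.0, c_i=50.0)
decision = "postpone" if k_c > 1 else "replace now"
```

With this slowly decaying stand-in reliability, the criterion is above 1, so the sketch selects postponement; a faster decay of R*(t) raises the expected corrective cost in the denominator and pushes the criterion below 1.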

4. Methodology

This section summarizes the methodology that consists of updating a degradation-based reliability model from data as well as the maintenance optimization for preventive replacement that leads to an adaptive maintenance model. Considering a completely new asset for which neither reliability nor degradation information is provided, the methodology focuses on four stages that correspond to the four maintenance policies related to the knowledge level of the reliability and degradation process of the asset.

4.1. Run-to-failure stage

As no information is available on the asset, the first stage consists in letting the asset run until failure before performing a corrective maintenance action that restores it to AGAN condition. This provides a set of failure times that is used to fit a parametric reliability model as presented in Section 2.1.

4.2. Systematic preventive maintenance stage

According to the parametric reliability of the asset and the corrective and preventive maintenance costs, an optimal periodicity Tp is obtained using Eq. (27).

4.3. Monitoring the degradation and CBM stage

The third stage consists in monitoring the degradation process to fit a degradation model that will be used in the last stage. Consequently, the monitoring should be tuned so that the measurement points are sufficient for the modelling. Two design variables are to be defined: the preventive degradation threshold beyond which the preventive task is performed, and the periodicity of the degradation measurements. Generally, these variables are adjusted using experimental design, sensitivity analyses and feedback from experience. Given a penalty cost for the monitoring (e.g., inspection cost) and practical constraints, an optimal set of these parameters can be identified.

4.4. Forecasting the degradation and adaptive maintenance stage

Given the degradation measurements collected in the previous stage, a degradation model identification can be attempted. The selection of the most suitable model is complex; it depends on the nature of the degradation data and the sample size. A goodness-of-fit criterion is used to give guidance on the most suitable degradation model. Once the degradation model has been identified, the adaptive maintenance stage defines the first inspection time as the time corresponding to the preventive systematic maintenance periodicity, t1 = Tp. Considering that the item has survived until this time, an inspection is performed and the degradation level is measured. Given the degradation model Z(t), the distribution of the hitting times is updated, and so is the failure function F*(t). A new reliability model is then fitted on this failure function, from which we can deduce the mean residual life as well as a new optimized time Tp* for a preventive replacement. At this step, the cost criterion KC is assessed: if KC > 1, the maintenance is postponed to the next inspection time tj+1 = Tp*; otherwise, the preventive maintenance action is performed at the current inspection time tj. Figure 2 illustrates the updating process of both the degradation and the distribution of the threshold hitting times.

Figure 2.

Illustration of the adaptive maintenance and graphical interpretation of the cost criterion KC.

The superscript ‘*’ stands for any value or parametric law that is updated given the last degradation measurement. As the reliability model R*(t) is updated each time a new degradation measurement is collected, the adaptive maintenance model is updated accordingly. This methodology makes it possible to extend the useful life of an item whose specific degradation path is more favourable than the mean trend and, in general, to obtain a better estimation of the mean residual life. Figure 3 presents the simulation flowchart of the adaptive maintenance model. Once the degradation model is identified, the first step consists in the simulation of random degradation paths and the computation of the hitting times of the degradation threshold corresponding to the failed state. From this collection of hitting times, a generic statistical reliability law is computed that determines the first inspection time t1 = Tp given the costs of the different maintenance actions. At this time, and considering that the item has not failed, an inspection is performed to measure the degradation level. From this degradation value and given the degradation process, new degradation trajectories are simulated in order to obtain a new set of hitting times of the threshold, and the reliability law is updated accordingly. The next step is the assessment of the cost criterion KC(tj), which decides whether or not a maintenance action should occur at the present inspection time. The item may of course fail between consecutive inspection times, which leads to a corrective maintenance action (AGAN replacement). In this case, the failure time is added to the failure times database in order to update the time of the first inspection.

Figure 3.

The simulation flowchart of the adaptive maintenance model.

5. An illustrative example

The methodology to obtain an adaptive maintenance model is applied to an illustrative example. The degradation model used in this example is a non-stationary Wiener process as presented in Section 2.5:

$$Z(t) = Z(t_0) + a\,t^{b} + \sigma\,W(t) \tag{34}$$

Here a, b and σ are random parameters drawn for each degradation path. They are assumed to follow a uniform distribution with lower and upper bounds of 0.8 and 1.2. The degradation failure threshold is set to zf = 100, and each degradation path has an initial degradation Z(t0) = z0 = 0. This degradation model is assumed to be unknown during the first two stages of the study. The model is used to generate failure times during the first stage (i.e., the crossing times of the failure threshold). Figure 4 shows three simulations of the degradation process.

Figure 4.

Three simulated paths of the Wiener-based degradation process. The corresponding failure times are, respectively, 42.53, 122.75 and 197.75 days.
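The generation of failure times from Eq. (34) can be sketched as follows. This is an illustrative discretisation and not the authors' simulation code: it assumes that W(t) is a standard Brownian motion, that time is expressed in days, and that the path is sampled with a fixed step `dt`; the function name `simulate_hitting_times` and the values of `dt`, `t_max` and `seed` are choices made for the example.

```python
import numpy as np


def simulate_hitting_times(n_paths, z_f=100.0, dt=0.1, t_max=1000.0, seed=0):
    """First-passage times of Z(t) = a*t**b + sigma*W(t) through z_f, Z(0) = 0."""
    rng = np.random.default_rng(seed)
    # One parameter triple per path, drawn from U(0.8, 1.2) as in the example.
    a = rng.uniform(0.8, 1.2, n_paths)
    b = rng.uniform(0.8, 1.2, n_paths)
    sigma = rng.uniform(0.8, 1.2, n_paths)

    w = np.zeros(n_paths)             # Brownian motion W(t), with W(0) = 0
    hit = np.full(n_paths, np.nan)    # NaN marks "threshold not crossed before t_max"
    alive = np.ones(n_paths, dtype=bool)

    n_steps = int(round(t_max / dt))
    for k in range(1, n_steps + 1):
        t = k * dt
        # Independent Gaussian increments with variance dt.
        w += rng.normal(0.0, np.sqrt(dt), n_paths)
        z = a * t**b + sigma * w
        # Record the first time each surviving path reaches the failure threshold.
        crossed = alive & (z >= z_f)
        hit[crossed] = t
        alive &= ~crossed
        if not alive.any():
            break
    return hit


failure_times = simulate_hitting_times(n_paths=2000)
mttf = np.nanmean(failure_times)
```

The sample mean `mttf` can be compared with the mean time to failure reported in Stage 1; because crossings are only detected at grid times, a coarse `dt` slightly overestimates the hitting times.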

The maintenance costs are as follows:

  • Corrective maintenance action, Cc = 2500 €

  • Preventive maintenance action, Cp = 500 €

  • Inspection cost, Ci = 50 €

5.1. Stage 1: run to failure

In the first stage, the asset runs until failure. The distribution of the failure times is assumed to be unknown at first. From a collection of 5000 failure times, a three-parameter Weibull distribution is fitted using the maximum likelihood method. The estimated parameters are $\hat{\beta} = 1.2357$, $\hat{\eta} = 90.4343$ and $\hat{\gamma} = 39.0214$. The mean time to failure computed with the simulated failure times is MTTF = 126.76 days, and the expected value of the fitted Weibull distribution is 123.47 days. Figure 5 compares the pdf histogram obtained from the failure data with the pdf of the fitted Weibull distribution. The expected corrective maintenance cost is $\bar{c}_{mc}$ = 20.25 €/day (see Eq. (30)).

Figure 5.

Probability density function from simulated failure times and estimated three-parameter Weibull pdf function.
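The two figures quoted for the fitted distribution can be checked by hand. The sketch below assumes the usual three-parameter Weibull parameterisation (shape β, scale η, location γ), whose mean is γ + η Γ(1 + 1/β), and assumes that the run-to-failure cost rate is one corrective action per mean lifetime; the variable names are illustrative.

```python
import math

beta, eta, gamma = 1.2357, 90.4343, 39.0214   # fitted parameters quoted in the text
c_c = 2500.0                                   # corrective maintenance cost (EUR)

# Mean of a three-parameter Weibull: location + scale * Gamma(1 + 1/shape).
mean_life = gamma + eta * math.gamma(1.0 + 1.0 / beta)

# Run-to-failure cost rate: one corrective action per mean lifetime.
corrective_rate = c_c / mean_life
```

Under these assumptions `mean_life` is about 123.5 days and `corrective_rate` about 20.25 €/day, in agreement with the values reported above to within rounding.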

5.2. Stage 2: systematic preventive maintenance

Considering the reliability obtained at the previous stage, the optimum systematic preventive maintenance periodicity is Tp = 41.79 days with an expected daily maintenance cost $\bar{c}_m(T_p)$ = 12.61 €/day. Figure 6 represents the evolution of the expected maintenance cost for different values of Tp (see Eq. (27)).

Figure 6.

Optimum systematic preventive maintenance periodicity Tp.
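The search for the optimum periodicity can be sketched with a simple grid search. Eq. (27) is not reproduced in this section, so the code assumes the classical age-replacement cost rate, that is, the expected cycle cost Cc F(T) + Cp R(T) divided by the expected cycle length, the integral of R(t) from 0 to T. The grid limits, the step and the variable names are illustrative choices.

```python
import numpy as np

beta, eta, gamma = 1.2357, 90.4343, 39.0214   # fitted Weibull parameters
c_p, c_c = 500.0, 2500.0                       # preventive / corrective costs (EUR)


def reliability(t):
    """R(t) of the three-parameter Weibull; equal to 1 before the location gamma."""
    x = np.clip(t - gamma, 0.0, None) / eta
    return np.exp(-x**beta)


# Fine time grid and cumulative trapezoidal integral of R(t) from 0 to t.
t = np.linspace(0.0, 200.0, 20001)             # step of 0.01 day
r = reliability(t)
dt = t[1] - t[0]
mean_cycle = np.concatenate(([0.0], np.cumsum(0.5 * (r[1:] + r[:-1]) * dt)))

# Assumed age-replacement cost rate: expected cycle cost / expected cycle length.
cost_rate = np.full_like(t, np.inf)            # T = 0 is excluded from the search
cost_rate[1:] = (c_c * (1.0 - r[1:]) + c_p * r[1:]) / mean_cycle[1:]

i_opt = int(np.argmin(cost_rate))
t_p, c_min = t[i_opt], cost_rate[i_opt]
```

With this assumed cost model the minimum cost rate is close to the 12.61 €/day quoted above. The curve is very flat around its minimum, so the minimiser returned by a grid search may differ somewhat from the reported Tp while changing the cost rate very little.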

5.3. Stage 3: condition-based maintenance (CBM)

In stage 3, the degradation is monitored. The purpose is to collect sufficient data for modelling the degradation process as well as to perform the condition-based maintenance. In order to find the optimal set of design parameters, namely the periodicity of inspection Ti and the preventive degradation threshold zCM for the condition monitoring, a Monte Carlo simulation approach was conducted. For each scenario, 5000 simulations were performed to reach the stationary maintenance cost. A minimum condition-based maintenance cost of 8.55 €/day was reached for the set of variables [Ti = 22 days; zCM = 64]. Figure 7 shows the surface plot of the related CBM cost per unit of time. The white sphere, located in the optimal region, represents the minimum cost value obtained. Conducting inspections too frequently leads to additional costs, whereas long intervals between inspections increase the probability of failure and the related corrective costs. Similarly, setting the condition monitoring degradation threshold close to the failure degradation level increases the likelihood of failure, whereas setting it to a very low level leads to premature replacement of the asset, thus shortening its useful life.

Figure 7.

Plot of the condition monitoring maintenance cost with respect to the periodicity of inspection Ti and condition monitoring degradation threshold zCM. The optimum cost value is 8.55 €/day for the set of variables [Ti = 22 days; zCM = 64].
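One point of the cost surface can be estimated with a short Monte Carlo sketch. The chapter does not detail its simulation rules, so the following conventions are assumptions: W(t) is a standard Brownian motion, a failure is noticed as soon as the path reaches zf and triggers a corrective replacement, every inspection is charged Ci (including the one that triggers the preventive replacement), and each simulated cycle starts from a new item. The function name `cbm_cost_rate` and the values of `dt`, `t_max` and `seed` are illustrative.

```python
import numpy as np


def cbm_cost_rate(t_i=22.0, z_cm=64.0, z_f=100.0, c_i=50.0, c_p=500.0,
                  c_c=2500.0, n_cycles=1000, dt=0.1, t_max=2000.0, seed=2):
    """Estimated long-run CBM cost per day for one (t_i, z_cm) pair."""
    rng = np.random.default_rng(seed)
    a, b, sigma = (rng.uniform(0.8, 1.2, n_cycles) for _ in range(3))

    w = np.zeros(n_cycles)                    # Brownian motion of each cycle
    running = np.ones(n_cycles, dtype=bool)   # cycles not yet ended
    cost = np.zeros(n_cycles)
    length = np.zeros(n_cycles)
    steps_per_inspection = int(round(t_i / dt))

    n_steps = int(round(t_max / dt))
    for k in range(1, n_steps + 1):
        t = k * dt
        w += rng.normal(0.0, np.sqrt(dt), n_cycles)
        z = a * t**b + sigma * w

        # Failure between inspections: corrective replacement ends the cycle.
        failed = running & (z >= z_f)
        cost[failed] += c_c
        length[failed] = t
        running &= ~failed

        if k % steps_per_inspection == 0:
            # Inspection of every item still in service.
            cost[running] += c_i
            # Preventive replacement when the measured level reaches z_cm.
            replace = running & (z >= z_cm)
            cost[replace] += c_p
            length[replace] = t
            running &= ~replace

        if not running.any():
            break

    length[running] = t_max   # cycles censored at t_max (not expected in practice)
    # Renewal-reward estimate: total cost over total operating time.
    return cost.sum() / length.sum()


rate = cbm_cost_rate(n_cycles=500)
```

Calling `cbm_cost_rate` over a grid of (`t_i`, `z_cm`) pairs would give a surface of the kind shown in Figure 7; the value obtained depends on the cost-accounting conventions assumed above, so it should not be expected to match the reported optimum exactly.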

5.4. Stage 4: degradation-based adaptive maintenance

In this last stage, the adaptive maintenance methodology is set up. Given the monitored degradation data, the Wiener process-based degradation model Z(t) can be identified for each degradation path using the maximum likelihood method. For each run, an inspection is conducted at t1 = 41.79 days (i.e., the scheduled time for systematic preventive maintenance). The degradation level Z(t1) is measured and the cost criterion KC(t1) is assessed according to Eq. (33). If KC(t1) ≤ 1, the item is replaced at t1 and only the cost of preventive maintenance is due; otherwise, the next inspection is scheduled at t2 = Tp(t1 | Z(t1)) given the updated RUL of the item. The same procedure is repeated at each inspection time tj until KC(tj) ≤ 1. Figure 8 shows an example of the adaptive maintenance methodology. For each simulation, the degradation process is assumed to be known. For this specific degradation path, three inspections were performed and the item was preventively replaced at the fourth inspection time.

Figure 8.

Illustration of the adaptive maintenance policy on a specific degradation path. Since the corresponding threshold hitting time is high, the adaptive maintenance extends the usage of the asset, thus fully exploiting its useful life.
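The updating step performed at each inspection can be sketched by restarting the simulation from the measured degradation level. The sketch assumes that the path parameters a, b and σ are known, as stated above for the simulations, that W(t) is a standard Brownian motion, and that an empirical survival curve is used for R*(t) instead of a refitted parametric law. The function name `updated_reliability`, the measured level z1 = 30 and the unit parameter values are hypothetical.

```python
import numpy as np


def updated_reliability(z_1, t_1, a, b, sigma, z_f=100.0,
                        n_paths=2000, dt=0.1, t_max=600.0, seed=1):
    """Empirical R*(t) for t > t_1 after measuring Z(t_1) = z_1."""
    rng = np.random.default_rng(seed)
    grid = t_1 + dt * np.arange(1, int(round((t_max - t_1) / dt)) + 1)
    z = np.full(n_paths, float(z_1))      # every path restarts from the measurement
    alive = np.ones(n_paths, dtype=bool)
    surv = np.empty(grid.size)
    t_prev = t_1
    for k, t in enumerate(grid):
        # Increment of the trend a*t**b plus a Gaussian increment of sigma*W(t).
        drift = a * (t**b - t_prev**b)
        z = z + drift + sigma * rng.normal(0.0, np.sqrt(dt), n_paths)
        alive &= z < z_f                  # a path that has failed stays failed
        surv[k] = alive.mean()            # fraction of paths still below z_f
        t_prev = t
    return grid, surv


# Hypothetical measurement at the first inspection time.
t_1 = 41.79
grid, surv = updated_reliability(z_1=30.0, t_1=t_1, a=1.0, b=1.0, sigma=1.0)

# Mean residual life: trapezoidal integral of R*(t) from t_1, with R*(t_1) = 1.
step = grid[1] - grid[0]
full = np.concatenate(([1.0], surv))
mrl = step * 0.5 * (full[:-1] + full[1:]).sum()
```

In this hypothetical case the remaining margin to the threshold is 70 with a unit drift, so the estimated mean residual life is close to 70 days. The curve `surv` plays the role of R*(t) in the cost criterion of Eq. (33).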

5.5. Maintenance policies comparison

Figure 9 represents the maintenance cost per unit of time obtained over 5000 simulations for each maintenance policy. The expected theoretical maintenance costs for the corrective and preventive maintenance policies are also represented (dashed lines).

Figure 9.

Maintenance cost per unit of time for the different maintenance policies.

At the end of the 5000 simulations, the number of inspections and failure events for each maintenance policy are the following:

  • Corrective maintenance: 0 inspections and 5000 failures.

  • Systematic preventive maintenance: 0 inspections and 40 failures.

  • Condition-based maintenance: 20,093 inspections and 182 failures.

  • Adaptive maintenance: 7397 inspections and 200 failures.

Due to the distribution of the failure times, the calendar-based preventive maintenance had the minimum number of failure events; on the other hand, it reduces the useful life of the item since the replacement takes place at t = 41.79 days regardless of the degradation level. Comparing the condition-based maintenance and the adaptive maintenance, the latter had slightly more failure events but required almost three times fewer inspections. The fact that both condition-based maintenance and adaptive maintenance had more failures than calendar-based maintenance makes sense: each time the preventive maintenance is postponed, there is a risk that a failure occurs between consecutive inspections.
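The counts listed above can be turned into per-run figures with a few lines of arithmetic. The sketch assumes that each of the 5000 simulations corresponds to one item life cycle; the dictionary layout and variable names are illustrative.

```python
n_runs = 5000

# (number of inspections, number of failures) reported for each policy.
counts = {
    "corrective": (0, 5000),
    "systematic preventive": (0, 40),
    "condition-based": (20093, 182),
    "adaptive": (7397, 200),
}

summary = {
    name: {
        "inspections_per_run": n_insp / n_runs,
        "failure_fraction": n_fail / n_runs,
    }
    for name, (n_insp, n_fail) in counts.items()
}

# Ratio of inspection effort between CBM and adaptive maintenance.
inspection_ratio = counts["condition-based"][0] / counts["adaptive"][0]
```

Under this reading, condition-based maintenance uses about 4.0 inspections per run against about 1.5 for adaptive maintenance (a ratio of roughly 2.7), while the fraction of runs ending in failure is 0.8% for systematic preventive maintenance, about 3.6% for condition-based maintenance and 4% for adaptive maintenance.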

6. Conclusion

This chapter presented an adaptive maintenance methodology for extending the useful life of an asset. The methodology uses the reliability-centred maintenance approach as well as the degradation-based reliability approach to define a degradation-based adaptive maintenance model. Background reliability information was presented in Section 1. Section 2 was devoted to the presentation of the degradation-based reliability approach with a focus on stochastic processes. Section 3 detailed the age-based maintenance cost model and its extension to consider inspection costs, which leads to the definition of the cost criterion KC used to select the best maintenance action to perform at each inspection time. Section 4 summarized the methodology, highlighting the procedure for updating the reliability estimation and predicting the degradation paths given the last measurement. The methodology was applied to a numerical example using a non-stationary Wiener-based degradation process with random parameters. Four maintenance policies, from run to failure to the adaptive maintenance stage, were compared. The results showed that the adaptive maintenance model had the minimum maintenance cost per unit of time. However, challenges remain in improving the methodology, for example:

  • The failure threshold can be hard to define. An elegant solution would be to consider a probabilistic distribution of this threshold instead of a deterministic value, or to combine expert judgement with fuzzy logic to take the uncertainties into account.

  • The degradation modelling is also a tricky step, especially for degradation processes with a changing degradation rate and load dependence. While stochastic processes can consider both unit-to-unit variability and time-varying dynamics of systems, the fitting procedure of such processes might lead to an inaccurate model. Given the degradation data history, the fitting procedure should select the most relevant measurements to accurately predict the future behaviour, especially for non-stationary and non-monotonic degradation mechanisms.

  • Finally, the adaptive maintenance methodology should be extended to the case of a system made of several components, each of them being ruled by its specific degradation mechanism.

This methodology can be applied to any asset provided that degradation measurements and degradation modelling are possible. Examples of application are the replacement of cutting tools in machining processes by monitoring the requested power, the replacement of bearings through vibration monitoring techniques, and the maintenance scheduling of railway track sections based on the assessment of track geometry condition.

References

  1. Nakagawa T. Imperfect preventive maintenance. IEEE Transactions on Reliability. 1979;28(5):402. DOI: 10.1109/TR.1979.5220657
  2. Blischke WR, Prabhakar Murthy DN. Introduction and overview. In: Blischke WR, Prabhakar Murthy DN, editors. Case Studies in Reliability and Maintenance. Hoboken, New Jersey: Wiley-Interscience; John Wiley and Sons; 2003. pp. 1–34. DOI: 10.1002/0471393002.ch1
  3. Prabhakar Murthy DN, Xie M, Jiang R. Parameter estimation. In: Shewhart WA, Wilks SS, editors. Weibull Models. Hoboken, New Jersey, USA: Wiley-Interscience, John Wiley & Sons; 2004. DOI: 10.1002/047147326X
  4. Bagdonavicius V, Nikulin M. Accelerated Life: Models Modeling and Statistical Analysis. In: Cox DR et al., editors. Monographs on Statistics and Applied Probability 94. Boca Raton, Florida, USA: Chapman and Hall/CRC; 2002. p. 334. DOI: 10.1201/9781420035872
  5. Huynh KT, Castro I, Barros A, Berenguer C. On the construction of mean residual life for maintenance decision-making. In: 8th IFAC Symposium on Fault Detection Supervision and Safety of Technical Processes; 29-31 August 2012; Mexico City, Mexico. 2012. pp. 654–659. DOI: 10.3182/20120829-3-MX-2028.00144
  6. Gorjian N, Ma L, Mittinty M, Yarlagadda P, Sun Y. A review on degradation models in reliability analysis. In: Kiritsis D, Emmanouilidis C, Koronios A, Mathew J, editors. Engineering Asset Lifecycle Management; 28-30 September 2009. Athens, Greece. London: Springer; 2010. pp. 369–384. DOI: 10.1007/978-0-85729-320-6_42
  7. Kleinbaum DG, Klein M. Survival Analysis: A Self-Learning Text. 3rd ed. London: Springer; 2012. p. 700. DOI: 10.1007/978-1-4419-6646-9
  8. Zhang Z, Si X, Hu C, Kong X. Degradation modeling-based remaining useful life estimation: A review on approaches for systems with heterogeneity. Proceedings of the Institution of Mechanical Engineers Part O: Journal of Risk and Reliability. 2015;1:1–13. DOI: 10.1177/1748006X15579322
  9. Lu CJ, Meeker WQ. An accelerated life test model based on reliability kinetics. Technometrics. 1993;37(2):161–174. DOI: 10.2307/1269615
  10. Wang W, Christer A. Towards a general condition-based maintenance for a stochastic dynamic system. Journal of the Operational Research Society. 2000;51(4):145–155. DOI: 10.2307/254254
  11. Barndorff-Nielsen OE, Mikosch T, Resnick SI. Lévy Processes. 1st ed. Basel: Birkhäuser; 2001. p. 418. DOI: 10.1007/978-1-4612-0197-7
  12. Van Noortwijk JM. A survey of the application of gamma processes in maintenance. Reliability Engineering & System Safety. 2009;94(1):2–21. DOI: 10.1016/j.ress.2007.03.019
  13. Wang X. Wiener processes with random effects for degradation data. Journal of Multivariate Analysis. 2010;101(1):340–351. DOI: 10.1016/j.jmva.2008.12.007
  14. Lu CJ, Meeker WQ. Using degradation measures to estimate a time-to-failure distribution. Technometrics. 1993;35(2):161–174. DOI: 10.2307/1269661
  15. Tang SJ, Guo XS, Yu CQ, Zhou ZJ, Zhou ZF, Zhang BC. Real time remaining useful life prediction based on nonlinear Wiener based degradation processes with measurement errors. Journal of Central South University. 2014;21(12):4509–4517. DOI: 10.1007/s11771-014-2455-9
  16. Si XS, Wang W, Hu CH, Chen MY, Zhou DH. A Wiener-process-based degradation model with a recursive filter algorithm for remaining useful life estimation. Mechanical Systems and Signal Processing. 2013;35(1-2):219–237. DOI: 10.1016/j.ymssp.2012.08.016
  17. Kahle W, Lehmann A. The Wiener process as a degradation model. In: Nikulin MS, Limnios N, Balakrishnan N, Kahle W, Huber-Carol C, editors. Advances in Degradation Modeling. 1st ed. Basel: Birkhäuser; 2010. p. 416. DOI: 10.1007/978-0-8176-4924-1
  18. Bonnans JF, Gilbert JC, Lemarechal C, Sagastizabal CA. Numerical Optimization: Theoretical and Practical Aspects. 2nd ed. Berlin: Springer-Verlag Berlin Heidelberg; 2006. p. 494. DOI: 10.1007/978-3-540-35447-5
  19. Hoang P, editor. Handbook of Reliability Engineering. 1st ed. London: Springer-Verlag; 2003. p. 663. DOI: 10.1007/b97414
  20. Huynh KT, Barros A, Berenguer C. Maintenance decision-making for systems operating under indirect condition monitoring: Value of online information and impact of measurement uncertainty. IEEE Transactions on Reliability. 2012;61(2):410–425. DOI: 10.1109/TR.2012.2194174
