System safety assessment (SSA) has become a standard practice in air traffic management (ATM). System safety assessment aims, through a systematic and formal process, to detect, quantify, and diminish the derived risks and to guarantee that critical safety systems achieve the level of safety approved by the regulatory authorities. Verification of compliance with the established safety levels becomes the last but an essential part of the safety assurance process. This chapter provides a Bayesian inference methodology to assess and evaluate the compliance with the established safety levels under the presence of uncertainty in the assessment of systems performances.
- risk assessment
- Bayesian inference
- safety compliance
Safety in aviation, and particularly in air traffic management (ATM), has evolved to the concepts of safety management and risk management. To achieve and guarantee safety, operators and providers develop and implement safety management system (SMS). SMS is a methodical and explicit approach for handling safety that comprises the required organisational arrangements and accountabilities, as well as the applicable safety policies and safety procedures. Hazard identification, risk assessment and risk mitigation have become essential processes within the framework of the SMS. Manufacturers, air navigation service providers (ANSPs) and operators shall implement a formal risk management process within their SMS.
This process, known as safety assessment (SA), has become a standard practice in the aviation industry. The global aim of SA is to ensure (by means of formal and systematic identification, evaluation and management of risks connected with hazards) that the design, production and operation of a system attain the safety levels settled by the safety regulatory authorities. Safety assessment has become a standard practice in the aviation industry [1, 2, 3, 4, 5, 6, 7, 8].
FHA—Functional hazard assessment
PSSA—Preliminary system safety assessment
SSA—System safety assessment
Figure 1 illustrates the liaisons between these three phases and the system life cycle.
System definition is the first stage of the system lifecycle. Its purposes are as follows:
To establish initial objectives for the system operating within its pertinent operational environment
To define the functions to support these objectives
To agree on high-level system requirements and interfaces
From the safety perspective, the first phase in the SA is referred to as functional hazard assessment (FHA). FHA aims to specify the safety level to be attained by the system in terms of safety objectives. A safety objective is a qualitative or quantitative statement that outlines the maximum acceptable frequency or probability of occurrence for a specific hazard or failure condition. If the hazard is a system failure, the safety objective will be the maximum allowed rate of failure. FHA is executed at the start of system design and development before the functions of the system have been deployed into procedures, equipment, or people components.
To determined system safety objectives, each function and combination of functions is assessed by safety analysts to:
Identify possible hazards and failures modes derived from the system definition.
Identify hazard consequences or effects on operations.
Evaluate the severity of each hazard consequences.
Determine safety objectives, i.e. the maximum acceptable frequency for each hazard’s occurrence.
Assess intended aggregated risk.
The following lifecycle stage is a system design. At this stage, the system operation and functions are defined in detail, describing the new system as an assortment of subsystems or components. In parallel, the risk assessment process develops a preliminary system safety assessment (PSSA). The objective of the PSSA is to prove that the designed system architecture can soundly attain the safety objectives stated during the FHA.
PSSA inspects the system architecture and concludes how failures could cause the hazards acknowledged in the FHA, it identifies required mitigations to minimise the risk or even eradicate them, and it specifies these measures in the form of safety requirements. A safety requirement is a risk measure, which may cover several different aspects such as operational, human, functional, organisational, procedural and performance, among others. Therefore, the PSSA process apportions safety objectives to the system elements and generates safety requirements, and then it stipulates the level of risk of each system element. The system architecture will meet the safety objectives established at the system level at the FHA, only if the architecture components satisfy their safety requirements.
After design, the next steps in the system lifecycle are implementation and integration. System implementation includes the production of the individual components, and integration refers to their amalgamation into the system. The next step, known as transfer into operations, refers to the system deployment, its on-site installation, its integration as part of an operational environment and the validation of its performances. During the system operation, maintenance actions, preventive and corrective, are accomplished in order to preserve the required safety and service level. Finally, once the system has reached the end of its operational life, decommissioning stands for the system withdrawal from the operation.
The last stage of the safety process, the system safety assessment (SSA), is developed in parallel to system implementation to verify whether the system, as implemented, achieve an acceptable risk. This means that the envisage mitigations have been put in place; all safety goals, objectives and requirements have been satisfied; and the expected level of safety has been successfully attained during the system operation [13, 14].
SSA monitors the safety performances of the system through its lifetime. It collects evidence and arguments to confirm that each implemented system component satisfies its safety requirements and safety objectives. It is, in the end, a continuous safety compliance assessment .
The SA process, although extended and widely accepted in aviation, is affected by a series of limitations. The main limitation neither resides in the fact that the process does not sufficiently considers nor widely capture the inherent uncertainty in every step of the safety assessment. The process has also shown limitations in dealing with lacking data if when the system is brand new or when there is few measurable information about its performance. These limitations severely affect the effectiveness of the last step in the process, the system safety assessment. Additionally, many times, decision-makers cannot support their safety compliance decision on objective tools. As a consequence, the process has not enough objectivity or transparency.
This chapter illustrates that a systematic approach for dealing with uncertainties in safety compliance evaluation is possible through Bayesian reasoning. Bayesian inference is a systematic method that helps decision-makers to select a suitable path in relation to the acceptance of a system against its safety results. It is particularly useful if under the presence of uncertainty about the actual failure rates of a system and/or about the consequences of the decision-making process. It could also take into consideration the predilections of the decision-makers, experts’ understanding and the consequences of the decisions to be made.
2. System safety assessment limitations
Most safety assessment decisions are taken under the assumption that the magnitudes of the variables and parameters describing the system performance are equal to their estimates. But, this postulation is valid as long as there is enough data or precise expertise for an accurate estimation of the system parameters . This does not happen in many situations, particularly for new systems where only tiny information is accessible about its performance. Uncertainty also comes from partial or imprecise models or deficient data gathering.
There are several approaches to the concept of uncertainty [17, 18, 19]. Uncertainty is often understood as a “state of knowledge” . Ayyub  describes it in terms of knowledge imperfection due to intrinsic shortages of knowledge acquisition. Walker  expresses uncertainty as “any departure from the unachievable ideal of complete determinism”. Aven  defines it as “….lack of understanding about the behaviour and outcomes of a system, and discernible magnitudes”.
Although there is a wide variety of definition for the concept of uncertainty, the common element in all of them is the notion of deficient or partial knowledge of a system and its performances because of shortages in apparent information and noticeable data [24, 25].
Uncertainty denotes the nondeterministic conduct of a system and the ambiguous magnitudes of the parameters that define how the systems behave or perform. It might have an epistemic or aleatory nature. Aleatory uncertainty accounts for the usual disparity of the physical phenomena. Epistemic uncertainty accounts for the limited knowledge of the parameters used to describe and explain the system [26, 27].
Both types of uncertainties are an essential component of any safety assessment. Uncertainty is introduced through the SA process at several stages. During FHA uncertainty is related to the modes of failure and the consequences of such failures. There are also uncertainties related to the extent of the consequences and consequently to the severity assigned of every failure condition. All these uncertainties are also translated into the assignment of SO—safety objective (the lower frequency of occurrence admissible for each failure circumstance), and into the derivation of safety requirements during the PSSA. During the SSA, uncertainties will come from inaccurate or incomplete medialization or data gathering.
The current safety compliance process acknowledges that multiple potential failure situations are possible, i.e. a single failure condition or hazard might lead to several failure modes and, accordingly, to diverse effects and consequences. This uncertainty has been traditionally mitigated with the definition of the worst-case scenario. This way to proceed appears to be too biased and over-conservative, which lead to excessively conservative safety requirements. The consideration of worst-case scenario incorporates a sort of guard band to reduce the chance of accepting a system that does not perform safely enough. This guard band implies a cost to the system. This could only be evaded if the decision-maker has truthful (i.e. not conservative neither optimistic) guesses of the uncertainties in the magnitudes backing up the decisions.
As can be seen, most decision-making processes in safety compliance assessment during SSA imply judgement of safety performance in a context with uncertainty [28, 29]. However, the existing SSA process does not comprise a methodical process to cope with all those uncertainties. Today, SSA is reduced to gathering evidence and a simple binary comparison of those evidence towards safety goals and requirements.
3. System acceptance decision under uncertainty
Let us consider that the outcome of the SSA process is a dual pronouncement by the safety regulator to authorise, or not, the operation of a system. To help decision-makers in such a judgement, six uncertainties should be computed: two related to the acceptance of the system, two linked with the nonacceptance of the system and two linked with the consideration of insufficient information.
An essential step is also to evaluate the decision-maker’s utilities. Decision-maker’s utilities reflect the consequences, expressed typically as costs, connected to each of the former listed uncertainties. Determining an individual’s utilities typically comprises expressing preferences among different options [30, 31, 32, 33].
Figure 2 shows a decision diagram for safety assessment. Rectangles stand for decision node. The decision-maker choices are as follows:
—Judge the system compliant.
—Judge the system as noncompliant.
—Judge the information insufficient.
Circles are random nodes representing the “states of nature”, where:
represents that the system is actually compliant.
represents a NOT compliant system.
The uncertainties about the system states are dependent on the data and information “available and will be calculated in subsequent sections of the chapter.
The paths in the tree correspond to the likely outcomes following the actions by the decision-maker. Six outcomes are considered:
: The system is affirmed compliant and it is so.
: The system is affirmed compliant though it is not.
: The system is affirmed NO compliant while it is truly trustable.
: The system is affirmed NO compliant and it actually is so.
: Although the system is truly compliant, it is not enough to make a decision.
: It is not enough to make a decision.
The rightmost end of the tree indicates the decision-maker’s utilities for each of the six branches. Each pair determines a consequence of decision-making. The utility which is defined on can be expressed as and defines the preferences of the decision-maker.
If action is taken, larger compliance is preferred over a smaller one:
If action is taken, diverse preferences can be outlined.
a. After a nonacceptance decision, the actual state of the system becomes irrelevant. This situation is equivalent to a constant utility for each value of , i.e.
b. The combination of a nonacceptance decision and low system compliance is perceived as an opportunity loss. With decision-maker loses the occasion to admit a trustworthy system. The utility function would not be constant any more, and smaller values of the actual system compliance would be preferred over larger opportunity loss). if and only of .
Despite the precise forms of and , there is an “equilibrium” value such as
Therefore, the utility functions must follow the following relations:
A decision-maker should choose the action that maximises the predictable utility . He should choose the action such that satisfy the following expression:
4. Quantification of the uncertainties
Safety compliance has been allocated a probability of truth or falsity. This probability corresponds to the decision-maker uncertainty (or state of knowledge), about safety compliance being true.
This probability is, namely, the uncertainty on the state of nature of the system compliance considering previous knowledge and information which is expressed as where a proposition “Data” stands for data, while “Inf” stands for background information. This section details how are calculated.
The proposed structure subscribes the concept that probability is not a frequency, rather a measure of uncertainty, belief or a state of knowledge. That is, probability allows doing credible thinking in situations where reasoning with certainty is not possible.
The result is the predictive probability that the system meets the safety objectives for what it has been designed, considering the envelope of data, knowledge and information gathered about the system during its design, production and operation.
To that aim, compliance assessment is redefined as the calculation of the degree of belief in the fulfilment of the applicable SO by the candidate system. The system is considered compliance if all the rate of failures satisfy their pertinent safety objective
Here the basis of Bayesian theory is applied to obtain an improved estimation of the system’s components rate of failure .
Let us define a set of propositions, each one with a probability stating the grade of confidence in its states, being these states either TRUE if is lower than its safety objective, , or FALSE otherwise.
This grade of confidence is denoted as a conditional probability. Each conditional probability mirrors our grade of assurance in satisfying its mandatory safety objective, .
The grade of assurance in the system compliance will be evaluated as the intersection of the belief of compliance of all particular failure conditions:
Uncertainties about the magnitude of the variables that govern the stochastic performance of the system are random variables which follow particular probability functions (pdfs). Consequently, rates of failure become, therefore, also random variables. Therefore, safety assessments are reduced to the determination of the failure rate pdfs.
For straightforwardness, we adopt probability function for the failure rate of a component, , conditional upon one or more unknown parameters . Other indicators could be selected instead, for example, the delay time between defect and failure or the number of failures in a period of time, but the theory hereafter applies equally.
The corresponding probability function is indicated as . To some extent previous knowledge about the expected values of should impact decisions about the system acceptability. However, is commonly unknown, and is not known unambiguously, so it cannot be used directly in making such decisions about system acceptance. is usually approximated by estimatingover data and supposing the parameters are equal to estimates.
The Maximum likelihood estimate (MLE) is
In practical applications, previous inequality is usually strict, and a single maximum exists. The classical approach to inference now substitutes by the first-order approach. In this case, as few data are available; this approximation would be very poor:
Decisions concerning compliance assessment, which seek for unknown values of , might alternatively be resolved conditional upon the observation, information, data or available knowledge, rather than on the unknown parameters. This allows to base decisions upon instead on , provided that Data and Inf are known.
The conditional probability distribution describes then the uncertainty in the parameter under study () considering observed data “Data” and the prior understanding of the system Inf. It denotes the sample of the rate of failure distribution, conditional upon the observed data, and it is exactly the magnitude required for the decision-making process, with no approximation. is calculated using the Bayes’ theorem:
is referred to the posterior distribution. All inference regarding will be derived from the posterior distribution.
corresponds to the likelihood distribution, at times mentioned as sampling.
is the prior distribution.
is the marginal probability.
Epistemic uncertainty is incorporated through the prior distribution . It synthesises the level of confidence in our model parameters , and it expresses experts’ preliminary state of information or knowledge. The prior distribution might be informative or non-informative.
The first ones deliver important information about the unquantified parameters. They are the way to capture past data and expert knowledge into a probability distribution and incorporate them into the model. Conjugate priors streamline the assessment of the preceding equation and permit analytical resolutions. However, prior can follow any distribution, and the preceding equation can be solved using numerical integration.
Non-informative priors are sometimes named as flat priors, vague priors, diffuse priors or reference priors. They are used when there is just very little background information about the parameters.
Most of the times, the Bayesian method requires numerical simulation because of the complexity of the distributions involved. That implies that the solution of Eq. (27) has to be obtained by numerically Markov chain Monte Carlo (MCMC) simulation [36, 37].
The resulting posterior distribution, , stands for updated knowledge about and, as stated before, will be the foundation for all inferential conclusions regarding
The distribution signifies the aleatory uncertainties or change naturally included in data and models. It also accounts for inefficiencies in the data assembly as well as inadequacies in the models. Likelihood function most commonly employed in system safety assessment are binomial, Poisson or exponential ones [38, 39, 40].
And finally is just a normalisation factor.
can be obtained from the posterior distributions through the marginalisation of the parameter , as shown in the next equation:
Eq. (12) calculates an average of the model uncertainty through the integration of the sampling through the posterior distribution . The outcome is a predictive probability of a failure rate meeting its safety objective.
The safety assessment is a methodical and prescribed procedure applied by ANSP to find, quantify and diminish risks in ATM systems and ensure that new services or systems reach assurance levels required by the aviation authorities. The assessment of safety compliance against approved safety levels becomes the last but essential part of the safety assurance process.
Nevertheless, this method is still exhibiting a series of limitations, the most important being its failure to cope with the uncertainty intrinsic in each step of the assessment and its lack of ability to deal with the lack of data in early stages of operation, and only small measurable information about its performance can be accessed. While most choices in the safety assessment involve a trial under uncertainty, the present system safety assessment process does not embrace any organised process or help to address all these uncertainties. So, the process misses the simplicity and impartiality essential for regulatory decision-making.
This chapter discussed the mathematical grounds for a cohesive Bayesian inference methodology, to assess and evaluate compliance with system safety goals and requirements, taking into account the uncertainty in performances. This work proposes a Bayesian structure that assesses safety compliance as a decision-making issue taken place under the presence of uncertainty.
Bayesian approach enables more comprehensive management of the uncertainties inherent to all system safety assessments and improves impartiality and accepting of compliance decisions and judgements, particularly in the cases where uncertainty is a limitation. This method might be applied to any safety or regulatory compliance process. It might be directly implemented by either operator or manufacturers, as well as by safety oversight authorities.
This work aims to increase the use of statistical Bayesian methods in the ground of aviation safety compliance assessment, up to a level equivalent to the one achieved so far in other critical industries, such space or nuclear power industries. The method offers a significant improvement to how ANSP presently take on regulatory safety compliance. Whereas the theoretical grounds are not new, their application to aviation signifies a noteworthy progression over current practices.
Conflict of interest
The authors declare no conflict of interest.