Data analysis for three projects.
Large, complex and challenging engineering projects require extensive understanding and management of risk. How these risks are identified at the infancy of a project and subsequently mitigated throughout the project lifecycle is critical to successful delivery. Many projects begin with a comprehensive attempt to identify risks but lack the tools to manage and measure risk mitigation as the project progresses through the lifecycle causing the project to spiral out of control. This chapter outlines details of a risk model that uses a system within a system approach of identifying and segmenting risks. The model can then be analysed quantitatively and generate a visual lifecycle risk profile that allows the project team to monitor risks continuously in the project lifecycle. Furthermore, the use of a baseline or ideal project is proposed that is used as a measure of likely success against new projects.
- risk profile
- enterprise modelling
- project lifecycle assessment
- supply chain risks
1. Introduction
Highly complex platform systems, such as ships, aircraft and land vehicles, present enormous technical and financial project challenges, and these include modifications and enhancements to their systems during their long service life. Many project management decisions are injudicious because they are made without a clear understanding of key risks and their consequences. This leads to budget overruns, schedule delays, system failures and ultimately disgruntled customers. When faced with managing complex projects, many engineering organisations develop their own bespoke risk handling methods, which attempt to control project failures with varying degrees of success.
According to the international standard on risk management ISO 31000, risk is the possibility of an undesirable event happening. This definition is made up of two aspects: the ‘severity’ of the undesirable event, and the probability or likelihood of it actually ‘happening’. When undertaking extensive, highly complex and challenging projects, it is essential that any large organisation identify and manage all risks that could preclude success.
Unfortunately, in reality it is all too common in industry for risk to be treated as an afterthought and even in some cases seen as a box ticking exercise. It is crucial that a strategy is developed for identifying, handling and mitigating risks to ensure the success of the project. How well this is achieved can make or break the project and even the organisation.
In some organisations, risk management and analysis tools may be used by dedicated risk engineers who are trained in such practice. However, in doing their work, risk engineers still rely on either project managers or engineers working on the project to provide appropriate information for risk assessment. While these individuals are no doubt well aware of possible risks relating to their project, capturing information about the risks is often subject to influences and consideration of the relationships among parties in the project. Consequently, outcomes can vary significantly among different groups. It is clear that a system to capture and analyse risks in a new project, either prior to commencement or at the early stages of the project, is desirable.
Risks in large engineering projects can come from many sources, including uncertainties in the work which can influence and determine cost and time of execution. Essentially, every activity in the project has varying degrees of risk. Traditionally, the focus is on reliability, availability, maintainability and supportability (RAMS). Engineering design professionals usually use methods such as failure mode, effects and criticality analysis (FMECA), fault tree analysis and event tree analysis to express performance as a quantitative value that is essentially an indicator of risk. Modarres went further to identify, rank and predict contributors to risk. Modarres calculated probabilistic risk for different scenarios and offered some interesting methods of presenting risk in graphical forms. This work illustrated ways of quantifying risks and hence the possibility of ranking them accordingly. Ayyub used a number of real-life examples to illustrate ways risk data can be manipulated or partially used to achieve useful outcomes. Claypool et al. conducted surveys of risk management techniques with 110 managers, who believed there was room for improvement. They particularly highlighted that little work has been conducted into reducing risk in the supply chain, which large scale engineering projects depend heavily upon.
Abi-Karam  studied design-build type of construction projects and identified the risks in the proposal, pricing, project schedule, performance measures, contractual liability and safety areas. These risks should be identified as thoroughly as possible and managed continuously even beyond project completion.
Through their lifecycle management strategy, many organisations operate a risk and opportunity management plan (ROMP) for business units, projects and functions. Risks and opportunities are inherent in all project and business activities. Therefore, it is the responsibility of staff to continuously manage these risks and to promote and realise any opportunities. It is important, however, to recognise that the aim of ROMP is not to eliminate risk totally but to provide a systematic means to proactively control and direct the business in regard to mitigating risks and promoting opportunities, thus creating and protecting business value. It is noted that the risk level assigned to a project, determined as a result of mapping to the standard ISO risk matrix, can vary between companies.
The common tool for risk management in large engineering organisations is a risk register that records the opinion of project managers, engineers and other key staff involved in the project. The process of generating this register is often very subjective, and the assessment team may include internal and external stakeholders such as customers and alliance partners who participate in workshops, brainstorming and project meetings.
Complex engineering projects are usually coordinated by systems engineering methodology. The systems engineering approach, using a systems engineering management plan (SEMP), provides more comprehensive coverage of the project development context for large complex projects. A typical engineering project in the defence environment, for example, will follow a SEMP process usually structured to include a number of mandatory stages and theoretical gates which need to be passed before the change can be progressed. A core risk management activity during the systems engineering project lifecycle is incorporating risk identification, analysis and mitigation throughout the SEMP stages. However, risk assessment of the whole lifecycle at the project development stage often lacks project details and data, and so is not adequately addressed at the critical early stage.
This chapter examines the SEMP and investigates whether risk management of the project can be defined much earlier in the process. It focuses on developing a quantitative risk model that can identify risks, develop a risk profile that can be presented in a visual format and manage and track residual risks throughout a project’s lifecycle. The potential development of a relationship between the model and a risk burndown chart offers a means of associating the identified risks with both their predicted financial and schedule impacts and the effect that proposed mitigations will achieve. The focus of this chapter is to assist large organisations to identify, visualise and manage risks throughout the lifecycle of a project. It should also be noted that some of the methods of calculation chosen are not mandatory, and the proposed model has considered the need for flexibility to allow for alternative interpretations and weightings. The risk burndown chart would be an ideal tool for visualising how different strategies in the allocation of resources, financial investment, cash flow, technical challenge, etc. could affect risks. A risk model is regarded as useful if it can identify key risks from quantitative data and suggest possible mitigation strategies. While many organisations already attend to highlighting risks with an array of tools, software and/or process methods, the usefulness of these is often diminished by over-complexity and convoluted processes. Hence, probably the most important characteristics of a risk model should be simplicity and ease of use.
2. Risk assessment using a system model
The main problem with a risk register is the lack of a structured methodology for identifying risks and a systematic analysis process to determine and develop mitigation strategies for the complete engineering lifecycle. A generic enterprise model is necessary to provide a quantitatively generated risk profile. The model is not intended to compete with or replace current risk theory, tools or processes already available, but instead to offer a novel and enhanced method for managing and, importantly, visualising risks throughout a project lifecycle.
The approach described in this chapter takes into account the fact that large complex projects have several distinct stages and can last for years. This consists of developing an understanding of the factors affecting outcomes based on the contractual environment involving key stakeholders in the industry, the customer and the community at large. The purpose of modelling is to develop a numerical indicator which can be used to ameliorate the current processes involved in understanding and managing risk throughout the project lifecycle due to changes of these factors.
In order to set some form of qualitative baseline which can then be used for both quantitative assessment and analysis, the 3PE model (People, Process, Product & Environment), described by Mo (Figure 1), has been adopted to provide the system framework for investigating risks surrounding complex engineering projects.
In Figure 1, the physical elements of a system are shown as: the ‘product’ that is built from fundamental engineering sciences, this is the common view of most users and society in general. The ‘product’ is the tangible element that can usually give the ‘touch-and-feel’. In commercial sense, this is what the customer feels they are paying for.
Not everyone realises that the ‘people’ element is an integral part of the system. The element ‘people’ from the system’s point of view is not limited to the user. It includes all human participants who are involved, one way or another, to enable successful operation of the system to achieve its goals and applies to systems of any nature. In engineering projects involving design and build, the people are engineers, suppliers, technicians, managers, directors, stakeholders and customers from all organisations having an interest in the project.
To operate the ‘product’ properly, a set of procedures, i.e. ‘process’, should be defined and followed by everyone. The principles and practices of different people and organisations need to be synchronised and shared. This is the realisation of ‘practices’ that are accumulated, engineered and designed to generate knowledge of how to go about doing something and ensure success. A defined set of procedures not only allows the ‘people’ (remember there could be many people) to synchronise with the reactions of the system at different inputs during operation, but also ensures the system can overcome the challenges in operation.
Needless to say, these elements are interacting among themselves as shown by the double arrows between the elements in Figure 1. Without these interactions, the ‘product’ is not used by ‘people’, the ‘people’ do not follow the ‘process’ and the reaction of the ‘product’ is unpredictable. The outcome is obviously unsuccessful operational performance.
On top of this, while the three elements are working and interacting among themselves as a system, they co-exist within an ‘environment’. If the ‘environment’ is within the expectation of the system, the elements in the system can work and interact correctly and produce good system performance. On the contrary, if the ‘environment’ has changed to beyond extreme conditions, the system will fail. To overcome this problem, the system has to be changed, i.e. some or all of the elements have to be changed to adapt to the new ‘environment’. If nothing is done to the system while the ‘environment’ has changed, the system can potentially become out-of-date and/or obsolete.
In summary, the main elements in the 3PE model are people, process and product, which are located within an environment. For each of the elements, a list of generic risks, more or less common to all projects, was developed and subsequently reconfigured into baseline questions. Once the model is defined, risk elements can be assigned and assessed. The risk indicator of a project can be estimated from the 3PE model as a normalised distribution of risk, denoted by N(μj, σj), where j is a particular project.
In order to evolve the risk model further, the theory of generating a percentage of success for a given project has been used. The hypothesis being that a ‘desirable’ project, would have minimal risk that could be easily mitigated and has a percentage of success which can be established as the benchmark (i.e. high percentage of success). The ‘desirable’ project is defined as a distribution N(μd, σd).
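To calculate the risk of not achieving project success, the differential distribution will show the risk of the project in relation to the ‘desirable’ project. The mean and standard deviation of this differential distribution N(μ, σ) can be calculated using equations:
$$\mu = \mu_j - \mu_d \tag{1}$$
$$\sigma = \sqrt{\frac{\sigma_j^2 + \sigma_d^2}{2}} \tag{2}$$
The risk indicator at time of measurement is then defined as:
$$F = P(x > 0) = 1 - \Phi\left(-\frac{\mu}{\sigma}\right), \quad x \sim N(\mu, \sigma) \tag{3}$$
where Φ is the standard normal cumulative distribution function.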
The ‘desirable’ project is a reference only; it may still have some risks, but they should be acceptable and manageable. Eq. (3) gives the probability that the risk of a project is greater than that of the ‘desirable’ project.
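The calculation of the risk indicator can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the differential mean is the simple difference μj − μd, that the differential standard deviation is the root mean square of σj and σd (a form that reproduces the differential values tabulated in the worked example later in this chapter), and that F is the area of the differential distribution to the right of x = 0. The function names and the example project N(6.0, 2.0) against a benchmark N(5.0, 2.0) are hypothetical.

```python
import math


def differential_distribution(mu_j, sigma_j, mu_d, sigma_d):
    """Mean and std. dev. of project j relative to the 'desirable' project.

    Assumption: difference of means, root mean square of the two
    standard deviations.
    """
    mu = mu_j - mu_d
    sigma = math.sqrt((sigma_j ** 2 + sigma_d ** 2) / 2.0)
    return mu, sigma


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of N(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def risk_indicator(mu, sigma):
    """F = P(x > 0) for the differential distribution N(mu, sigma)."""
    return 1.0 - normal_cdf(0.0, mu, sigma)


# Hypothetical project N(6.0, 2.0) against a hypothetical benchmark N(5.0, 2.0).
mu, sigma = differential_distribution(6.0, 2.0, 5.0, 2.0)
print(mu, sigma)                          # 1.0 2.0
print(round(risk_indicator(mu, sigma), 4))  # 0.6915
```

A project identical to the benchmark gives F = 0.5, so values above 0.5 indicate a project that is, on balance, riskier than the reference.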
3. Case studies
Three projects were chosen to illustrate the methodology of calculating the risk of a project. These projects were or are being executed in the defence environment, but the general principles of computing the risk indicator F apply to all large complex projects.
PL1. This project was completed on budget and schedule with successful commissioning on site and acceptance by the customer. This project is considered medium size and combined OEM equipment and a customised installation. The normalised distribution of risk in this project is denoted by N(μ1, σ1).
PL2. This project was conducted in an alliance between two large defence contractors and the government. In this project, the highest risk was the need for a new technology to be developed by one of the collaborating companies. Since the new technology had not been substantially applied in the mission environment, it was considered to be a significant risk to the project. There was also considerable risk introduced by forming an alliance. The normalised distribution of risk in this project is denoted by N(μ2, σ2).
PL3. This project relates to the design, manufacture and installation of an enhancement for a specific class of naval ships. The size of the project is considered medium. Since the enhancement is not a complex item, the project is considered manageable as the design, fabrication and installation is to be fully controlled by the individual organisation. The normalised distribution of risk in this project is denoted by N(μ3, σ3).
Risks were extracted from each of these projects, with over 150 risks being identified. The compiled risks were then analysed for repeats and commonality within each of the elements and a list of common risks across the three projects established. To help focus the research in developing quantification methodology, 10 risks from each of the 3P categories (total 30 risks) were selected based on their generic nature and applicability to the majority of past projects. Project managers, engineers and key staff members who were involved in these projects were asked to rate these risks on a scale of 0–10, where 0 means no risk at all and 10 means extremely high risk. The summarised results of their ratings are shown in Table 1.
|Mean||Std. dev.||Mean||Std. dev.||Mean||Std. dev.|
It should be noted that a higher value in Table 1 indicates a higher risk ranking.
The three projects can be represented by a normal distribution profile as shown in Figure 2. The graph in Figure 2 can be interpreted as follows, using PL2 as the example. The normal distribution of PL2 is represented as N(7.4000, 2.2084). According to statistical analysis, the probability of failure is the area under the curve. However, the whole PL2 distribution lies on the positive side of the profile; clearly this does not mean that the probability of failure of PL2 is almost 100%. The level of risk needs to be indicated as a value relative to some reference point or project(s).
From the perceived understanding of the nature of the three projects, it is generally agreed that serious challenges relating to PL2 need to be overcome and it is therefore considered a ‘risky’ project. PL1 has actually been completed and generally considered a success, while the PL3 project is clear in scope and is found to sit somewhere between the two. It can be seen in Figure 2 that PL2 has a higher risk level than the other two projects (skewed more to the right). In order to develop the risk model further, the idea of comparing to a ‘Desirable’ or ‘Ideal’ project is explored.
As previously mentioned, PL1 is generally considered a successful project: it was delivered within budget and on time, and was viewed favourably by the customer as addressing their original need. It can therefore be judged that its data must in some way align towards a ‘desirable’ project (i.e. minimal risk). To illustrate the computational process, the ‘desirable’ project could be defined from PL1 as an improvement of one ranking on the PL1 data, i.e. minus one rating value. The outcome of this improvement gives a distribution N(μd, σd) as shown in Table 2. Other methods of setting the benchmark ‘desirable’ project can be used, for example surveying a special expert group or finding a ‘reasonably smooth running’ project. However, in the context of computing a risk indicator for these projects, the choice of benchmark does not affect the methodology discussed in this chapter.
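The ‘minus one rating value’ construction can be illustrated with a short Python sketch. The ratings below are hypothetical (the survey data are not reproduced here), the function names are illustrative, and two details are assumptions rather than statements of the authors' procedure: the shifted ratings are floored at 0 to stay on the 0–10 scale, and the sample standard deviation is used.

```python
from statistics import mean, stdev


def rating_profile(ratings):
    """Mean and sample standard deviation of a list of 0-10 risk ratings."""
    return mean(ratings), stdev(ratings)


def desirable_from_baseline(ratings, improvement=1.0, floor=0.0):
    """Shift every rating down by `improvement`, never below `floor`."""
    return [max(floor, r - improvement) for r in ratings]


# Hypothetical ratings for a successful baseline project.
baseline = [4, 6, 5, 7, 3, 5, 6, 4, 5, 5]
desirable = desirable_from_baseline(baseline)

mu_b, sigma_b = rating_profile(baseline)
mu_d, sigma_d = rating_profile(desirable)
print(mu_b, mu_d)  # the benchmark mean is one rating point lower
```

When no rating hits the floor, the shift lowers the mean by exactly one point and leaves the standard deviation unchanged, so the benchmark keeps the spread of the baseline project.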
Applying Eqs. (1) to (3), the differential risk profile of the three projects can now be calculated as shown in Table 3. The data can be represented by the risk profiles as shown in Figure 3. This set of risk profiles can be interpreted more rationally.
In Figure 3, using PL2 as the example, the differential normal distribution of PL2 against the ‘desirable’ project is computed by Eq. (1) and Eq. (2) as N(1.5381, 4.5260). According to Eq. (3), the probability of failure is the area under the curve at the right-hand side of the y-axis (i.e. x = 0). In this area, the risk rating of PL2 is higher than the corresponding risk rating of the ‘desirable’ project. The area is calculated as 63.3%. Likewise, the probability of failure of PL1 and PL3 can be computed as 58.6 and 57.3% respectively. Please note that since the ‘desirable’ project in this case is generated from ‘improving’ upon the values of the best of the three projects, all projects will obviously have a riskier profile than the ‘desirable’ project. It is theoretically possible to manage a project better than the ‘desirable’ project. In that case, the probability of failure can be reduced significantly, approaching but not reaching zero.
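As a check on the arithmetic, the 63.3% figure for PL2 can be recovered by standardising the differential distribution. This assumes that the second parameter of N(1.5381, 4.5260) is a standard deviation and that the probability of failure is the area to the right of x = 0; Φ denotes the standard normal cumulative distribution function.

```latex
z = \frac{\mu}{\sigma} = \frac{1.5381}{4.5260} \approx 0.3398,
\qquad
F = P(x > 0) = \Phi(z) \approx \Phi(0.3398) \approx 0.633
```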
4. Lifecycle risk assessment
Following the Systems Engineering V lifecycle is an important decision towards successful management of the system development process. However, each of the stages in the systems engineering cycle is still lengthy and laden with risks. The systems engineering approach manages projects by a process of decisions and/or milestones.
To maintain the momentum and ascertain the right direction of development, it is necessary to impose routine checks throughout the forward branch of the V cycle with different level reviews. Figure 4 shows a typical systems engineering management plan time line with different stages and reviews (sometimes known as ‘milestones’) clearly defined. Technically, the milestones serve as ‘gates’ that control the flow of the project. If the design or progress is not acceptable, the systems team is not allowed to start work towards the next review, hence it is a ‘go or no-go’ gate.
Please note that these reviews are the minimum number of checks that should be installed in the forward systems design cycle. More frequent, and less formal reviews can happen at any time and anywhere in the duration.
Initially Figure 4 looks complicated, but it is basically a representation of all project activities in parallel. Starting from the top bar, which represents the activities to prepare a site, this line of activities will go through three reviews: system requirements review (SRR), preliminary design review (PDR) and critical design review (CDR). The sub-system ‘user system’ on the next bar will go through four reviews: concept design review (CoDR), SRR, PDR and CDR. The ‘integration and test facility’ sub-system is a complementary outcome of the next few design activities and will be synchronised with those activities according to the red links. The most important sub-systems are represented by the next three bars, in this case: high resolution receiver, power system and signal processing. These sub-systems require a final pre-production review (PRR) before going into actual manufacture. Two other reviews important to the cycle are the site acceptance test (SAT) and test readiness review (TRR). Both involve the customer and verification against the user requirements.
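The success probability of the project at milestone i can be estimated using the equation:
$$R_i = P(x > 0), \quad x \sim N(\mu_i, \sigma_i) \tag{4}$$
where N(μi, σi) is the differential distribution of milestone i against the ‘desirable’ project, with μi and σi computed from success indices rather than risk ratings.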
Hence, the estimated failure probability at this milestone is given by:
$$F_i = 1 - R_i \tag{5}$$
where R_i is the estimated success probability at milestone i.
To determine the change of probability in the lifecycle, each of the process stages should (in theory) reduce or mitigate known project risks. Therefore, if the project is progressing through the lifecycle as forecast, the risks of previous stages should be mitigated or resolved to the expected level. After each stage, the project risk then comprises the risks in achieving the remaining milestones. One way of representing this mathematically is by the logic of compound events in series. If there are M milestones in the project, the probability of project success at the kth milestone is the product of the probabilities of success of all milestones from milestone k onwards:
$$R_k = \prod_{i=k}^{M} R_i \tag{6}$$
where R_i on the right-hand side is the probability of success of the individual milestone i.
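Therefore, the risk of the project at milestone k is given by:
$$F_k = 1 - \prod_{i=k}^{M} R_i \tag{7}$$
where R_i is the probability of success of milestone i and M is the number of milestones in the project.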
With the formula for estimating the risk of a project at the different milestones established, the next challenge is to determine how the success index and standard deviation of a complex project can be reasonably estimated. This leads finally to the calculation for the overall project success index, which is essentially a means of accurately estimating/forecasting based on a known range of required activities within the project.
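The series logic for milestone risk described above can be sketched as follows. This is a minimal illustration that assumes the milestone success probabilities are independent and already known; the function name and the three probabilities are hypothetical.

```python
from math import prod


def milestone_risks(success_probs):
    """Risk at each milestone k: 1 minus the product of the success
    probabilities of milestone k and every milestone after it."""
    return [
        1.0 - prod(success_probs[k:])
        for k in range(len(success_probs))
    ]


# Hypothetical success probabilities for three consecutive milestones.
risks = milestone_risks([0.9, 0.8, 0.5])
print([round(f, 2) for f in risks])  # [0.64, 0.6, 0.5]
```

Because each later milestone drops a factor from the product, the computed risk can only fall or stay level as milestones are passed, provided the remaining success probabilities are not revised downwards.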
As defined earlier, the 3PE model elements are people, process and product, located within an environment. A list of activities for each element and relevant interactions between these elements can be established and thus a holistic view of the project can be formulated. The organisations systems engineering team can then estimate the success index levels separately for each element. The sum of risks for each of the project stages can then be presented as a risk level or profile of the project. However, it should be noted that this profile represents a snapshot of the risks in the system at a single moment in time.
Any activity in a project can be assessed by combining the 3PE indices. The indices of activity j in milestone i can be denoted by μij and σij. The estimated success index is then combined with both the mean index and the standard deviation of the desirable project. If a normal distribution is assumed for all activity indices in milestone i, and there are α activities in the milestone, the resulting combined distribution of milestone i is N(μi, σi), where μi and σi are the combined mean and standard deviation of the α activity indices.
The probability of success is calculated with Eq. (4) and the overall or holistic project risk for a milestone can then be computed by Eq. (7). Again, it is worth highlighting that the identified risks of previous stages should be resolved and/or reduced (preferably approaching zero) as the project progresses through the lifecycle. It is then possible to plot risk levels against the project stages to visually highlight the reduction.
5. Lifecycle risk assessment: worked example
For this worked example, the theoretical risk profiling methodology is applied to a logistics improvement project as an illustration of the process. The company employed approximately 3000 staff at the time of writing and was undergoing a series of significant changes. In order to stay competitive, the organisation was developing a transportation network to support its just-in-time (JIT) supply chain. To apply Eqs. (3) and (4), a conceptual desirable project is necessary. Based on the authors’ previous experience of similar process improvement projects, the mean success index and its standard deviation are set at μd = 4.5 and σd = 0.5 respectively. It should be noted once again that the desirable project values used in this example are for illustration purposes only. Other methods of setting the benchmark or desirable project can be used, as explained earlier. The scale used in this research is linear from 0 to 10, with 0 being no chance of success (sure failure) and 10 being sure success.
After integrating the project’s SEMP, the lifecycle process can be divided into six stages: (1) plan, (2) define, (3) preliminary design, (4) detailed design, (5) build, (6) deploy and close. At each milestone of the process, certain activities can be identified and these are aligned under the 3PE model elements as shown in Table 4. Each of the activities within the 3PE elements was assessed using a three-point estimate by the systems engineer of this project. The scale is the same as that used for the desirable project.
|Plan||Define||Preliminary design||Detailed design||Build||Deploy and close|
By applying Eqs. (1) and (2) to the quantified 3PE elements, the numbers can be converted to normal distributions. As an illustration, the Process element is used to demonstrate the analytic process as shown in Table 5.
| Milestone | Activity | Low | Most likely | High | Mean | Std. dev. | Differential mean | Differential std. dev. |
|---|---|---|---|---|---|---|---|---|
| Plan | Software system methodology | 3 | 5 | 6 | 5.500 | 0.316 | 1.000 | 0.418 |
| Define | System of systems methodology | 2 | 5 | 7 | 4.556 | 0.460 | 0.056 | 0.480 |
| | Strategic assumption surfacing and testing | 3 | 4 | 5 | | | | |
| | Build variation and manual checking | 3 | 5 | 6 | | | | |
| Prel. design | Define problem context (system requirements) | 3 | 5 | 6 | 4.333 | 0.422 | −0.167 | 0.462 |
| | Hard system definition | 3 | 4 | 6 | | | | |
| | System and user requirements | 3 | 5 | 6 | | | | |
| Detail design | Dynamic operation environment | 1 | 3 | 4 | 3.667 | 0.380 | −0.833 | 0.444 |
| | Applicability to wide variety of situations | 3 | 5 | 6 | | | | |
| | Improved tracking of original orders | 3 | 4 | 5 | | | | |
| | VSCM network design technology | 2 | 3 | 4 | | | | |
| Build | Life cycle costing | 5 | 7 | 8 | 5.667 | 0.715 | 1.167 | 0.617 |
| | Logistics support analysis | 3 | 5 | 6 | | | | |
| | Technical factors (operational effectiveness) | 4 | 8 | 9 | | | | |
| | Key performance indicators | 3 | 4 | 5 | | | | |
| | Economic factors (cost-effectiveness) | 3 | 5 | 6 | | | | |
| Deploy and close | Standard operating procedure | 3 | 5 | 6 | 4.833 | 0.707 | 0.333 | 0.612 |
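The step from three-point estimates to a milestone mean can be sketched in Python. The text does not state which three-point formula was used, so this sketch assumes the classic PERT weighting (low + 4 × most likely + high) / 6 followed by a simple average over the activities in the milestone. That assumption reproduces the tabulated means for the Define and Detail design milestones; it is not confirmed for the other rows, and the function names are illustrative.

```python
def pert_mean(low, likely, high):
    """PERT-weighted mean of a three-point estimate (assumed formula)."""
    return (low + 4 * likely + high) / 6


def milestone_mean(estimates):
    """Average of the PERT means of all activities in one milestone."""
    return sum(pert_mean(*e) for e in estimates) / len(estimates)


# Three-point estimates (low, most likely, high) for the Process element.
define = [(2, 5, 7), (3, 4, 5), (3, 5, 6)]
detail_design = [(1, 3, 4), (3, 5, 6), (3, 4, 5), (2, 3, 4)]

print(round(milestone_mean(define), 3))         # 4.556
print(round(milestone_mean(detail_design), 3))  # 3.667

# Differential mean against the desirable project mean of 4.5.
print(round(milestone_mean(define) - 4.5, 3))   # 0.056
```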
Using the means of the 3PE elements at each milestone and compared to the conceptual desirable project represented by a normal distribution N(4.5, 0.5), the risk distribution of the SEMP cycle can be summarised in Table 6.
|Ri of element||Fi of element||Fk of element at milestone k|
|Product/people/process (all three)|
Combining the lifecycle risk of each 3PE element, the risk levels at each milestone can be computed in Table 7.
The overall risk is plotted as a ‘risk burn down’ graph which highlights that initially at the planning stage of the project, there are many unknowns and uncertainties in the project (including those downstream) and the risk is considerably high. See Figure 5.
The risk computation as shown in Table 5 indicates that some activities are risky, e.g. ‘product’ (in this case, implementation of the JIT transportation network system) at the ‘deploy/close’ stage has high probability of failure. The main uncertainty at this stage of the project was ‘project management’. This is reflected by the fact that some resources have been redirected to other work when the JIT supply chain was close to completion.
The risk burndown plot can be used to verify the logical outcome of the project, i.e. as the project progresses through the lifecycle, many risks would have been resolved and thus the probability of failure or risk level should sequentially decrease. Failure of the project to follow this theory is an indication that the risks are not being managed as forecast.
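A simple automated check of this expectation can be sketched as follows. The sketch assumes that risk levels are re-assessed at each stage and recorded in lifecycle order; the function name, the tolerance parameter and the observed values are hypothetical and serve only to illustrate how a departure from the expected burndown might be flagged.

```python
def burndown_violations(risk_by_stage, tolerance=0.0):
    """Return (previous stage, stage) pairs where the assessed risk rose
    by more than `tolerance` instead of burning down."""
    stages = list(risk_by_stage.items())  # dicts keep insertion order
    return [
        (prev_name, name)
        for (prev_name, prev_risk), (name, risk) in zip(stages, stages[1:])
        if risk > prev_risk + tolerance
    ]


# Hypothetical re-assessed risk levels, in lifecycle order.
observed = {
    "plan": 0.95,
    "define": 0.90,
    "preliminary design": 0.80,
    "detailed design": 0.85,
    "build": 0.40,
    "deploy and close": 0.30,
}

print(burndown_violations(observed))
# [('preliminary design', 'detailed design')]
```

A non-empty result points to the stage transitions where mitigation did not proceed as forecast and where the project team may want to revisit the underlying 3PE assessments.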
6. Conclusion
This chapter combines the probability of success of every activity at each milestone of a project lifecycle based on a systems engineering process. The method uses an enterprise network model to study a manufacturing company’s risk of undertaking a change project to re-design its logistics system in terms of planning, monitoring and validating of the network efficiency and criticality. Several key changes have been put in place and their risks in the system’s lifecycle development are assessed within the enterprise network model to ensure greater probability of success.
The 3PE modelling framework provides a logical foundation for quantifying the risks of an engineering project. By assessing the expected level of achievable outcome in comparison with a ‘desirable’ project, this research has developed a novel method of generating a quantified risk indicator that can provide the basis for future planning improvements and hence the successful execution of complex engineering projects. The use of a risk burndown plot allows an organisation to visualise the level of risk that has been burnt down at each stage of the lifecycle. In addition, the 3PE model provides a method for modelling different scenarios and the ability to assess their effectiveness at burning down risks.