Open access peer-reviewed chapter

A New Non-Parametric Statistical Approach to Assess Risks Associated with Climate Change in Construction Projects Based on LOOCV Technique

By S. Mohammad H. Mojtahedi and S.M. Mousavi

Submitted: November 12th 2010. Reviewed: June 2nd 2011. Published: July 28th 2011.

DOI: 10.5772/21362

1. Introduction

During the last two decades, the Iranian government has implemented a major program to extend and upgrade construction projects in the oil and gas industry. Alongside this growth, many types of potential risks affect construction projects. A risk can be defined as an uncertain event or condition that has a positive or negative effect on project objectives, such as time, cost, scope, and quality (Caltrans, 2007; PMI, 2008). Thus, there is a need for a risk management process to manage all types of risks in projects. Risk management includes the processes of conducting risk management planning, identification, analysis, response planning, monitoring, and control on a construction project. Risk management encourages the project team to take appropriate measures to: (1) minimize adverse impacts to the project scope, cost, and schedule (and quality, as a result); (2) maximize opportunities to improve the project’s objectives with lower cost, shorter schedules, enhanced scope and higher quality; and (3) minimize management by crisis (Caltrans, 2007).

In project risk management, one of the major steps is to assess the potential risks (Ebrahimnejad et al., 2009, 2010; Makui et al., 2010; Mojtahedi et al., 2010). The risk assessment process can be complex because of the complexity of the modeling requirement and the often subjective nature of the data available to conduct the analysis in construction projects. However, the complexity of the process is not overwhelming and the benefits of the outcome can be extremely valuable (Mousavi et al., 2011).

Many decisions come with a long-term commitment and can be very climate sensitive. Examples of such decisions include urbanization plans, risk management strategies, infrastructure development for water resource management or transportation, and building design and norms. These decisions have consequences over periods of 50–200 years. Urbanization plans influence city structures over even longer timescales. These kinds of decisions and investments are also vulnerable to changes in climate conditions and sea level rise. For example, many buildings are supposed to last up to 100 years and will have to cope in 2100 with climate conditions that, according to most climate models, will be radically different from current ones. So, when designing a building, architects and engineers have to be aware of and account for the future changes that can be expected (Hallegatte, 2009).

Failing to consider climate change impacts on projects, especially those built for long-term use, can impose massive costs on governments and the public in the future. Nicholls et al. (2007) showed that, in 2070, up to 140 million people and more than US$ 35,000 billion of assets could be dependent on flood protection in large port cities around the world because of the combined effect of population growth, urbanization, economic growth, and sea level rise.

Resampling techniques are rapidly entering mainstream data analysis; some statisticians believe that resampling procedures will supplant common nonparametric procedures and may displace most parametric procedures (e.g., Efron and Tibshirani, 1993). These techniques use the data, or a data-gathering mechanism, to produce new samples whose results can then be examined. In resampling, estimates of probabilities are obtained by numerical experiments. Resampling offers the benefits of statistics and probability theory without the shortcomings of common techniques, because it is free of mathematical formulas and restrictive assumptions. In addition, it is easily understood and computer friendly (Simon and Bruce, 1995; Tsai and Li, 2008). The purpose of resampling techniques is to find the distribution of a statistic by repeatedly drawing samples from the original sample. The leave-one-out cross-validation (LOOCV) technique originated as a generic nonparametric estimator of bias and standard deviation (SD). To the best of our knowledge, no application of the LOOCV or other resampling techniques to climate change risk assessment of these projects has been reported. Moreover, a risk data analysis in construction projects often encounters the following situations (Mojtahedi et al., 2009):

  • It cannot be answered in a parametric framework.

  • It may need to be examined by standard and existing tools.

  • It can be assessed only by specially tailored algorithms.

For these reasons, the LOOCV resampling approach is proposed for assessing risks in construction projects. This approach is flexible, easy to implement, and applicable in non-parametric settings. In this paper, we contribute to this area by providing an effective framework for the application of the LOOCV to climate change risk data obtained from experts’ judgments in construction projects.

The chapter is organized as follows. In Section 2, we review the related literature and discuss the existing gap in the field. In Section 3, we describe the proposed non-parametric LOOCV approach to assess risks associated with climate change in construction projects. In Section 4, computational results for the construction of a gas refinery plant are presented as a case study. The results are discussed in Section 5. Finally, conclusions are provided in Section 6.

2. Literature review

Construction projects are subject to many risks due to the unique features of construction tasks, such as long period, complicated processes, undesirable environment, financial intensity and dynamic organization structures (Zou & Zhang, 2009), and such organizational and technological complexity generates enormous risks. The diverse interests of project stakeholders on a construction project further exacerbate the changeability and complexity of the risks (Zou & Zhang, 2009).

The purpose of project risk management is to identify risky situations and develop strategies to reduce the probability of occurrence and/or the negative impact of risky events on projects. In practice, project risk management includes the process of risk identification, analysis and handling (Gray & Larson, 2005). Risk identification requires recognizing and documenting the associated risk. Risk analysis examines each identified risk issue, refines the description of the risk, and assesses the associated impact. Finally, risk handling/response identifies, evaluates, selects, and implements strategies (e.g., insurance, negotiation, reserve, etc.) in order to reduce the likelihood of occurrence of risk events and/or lower the negative impact of those risks to an acceptable level. The risk-handling process contains the documentation of which actions should be taken, when they should be taken, who is responsible, and the associated handling costs (Fan et al., 2008).

It is widely accepted that construction project activity is subject to more risks than other business activities because of its complexity, and a wide range of risks associated with construction businesses has previously been identified. A typical classification of risks includes technical risks, management risks, market risks, legal risks, financial risks, and political risks (Shen, 1997).

Identified risks are assessed to determine their likelihood and potential effect on project objectives, allowing risks to be prioritized for further attention. The primary technique for this is the Probability–Impact matrix, where the probability and impacts of each risk are assessed against defined scales, and plotted on a two-dimensional grid. Position on the matrix represents the relative significance of the risk, and high/medium/low zones may be defined, allowing risks to be ranked (Hillson, 2002). While it is not practical to discuss the full implications of all the risks identified in the survey, this section intends to demonstrate the pattern of the risk environment by presenting some practical examples discussed in the five in-depth interviews following the survey. Not all the risks addressed in this section respond to the ‘‘most important risks’’ ranked in the risk significance index as interviewees have different experiences, and their perception or judgment may not be fully in harmony with the calculated average index scores (Shen et al., 2001).

Previous studies have focused on risk management in mega projects. Grabowski et al. (2000) discussed the challenges of risk modeling in large-scale systems, and suggested a risk modeling approach that was responsive to the requirements of complex, distributed, large-scale systems. By comparing the features and performance of three common types of project, Floricel & Miller (2001) showed that achieving high project performance requires strategic systems that are both robust with respect to anticipated risks and governable in the face of disruptive events. Miller & Lessard (2001) developed strategies to understand and manage risks in large engineering projects. Wang et al. (2004) sought to identify and evaluate these risks and their effective mitigation measures, and to develop a risk management framework that international investors, developers, and contractors can adopt when contracting large construction projects in developing countries.

Iranmanesh et al. (2007) proposed a new structure called RBM to measure the risks in EPC projects. By combining the risk breakdown structure with the work breakdown structure (WBS), a new matrix (RBM) is constructed. Hastak & Shaked (2000) presented a risk assessment model for international construction projects. The proposed model (ICRAM-1) assists the user in evaluating the potential risk involved in expanding operations in an international market by analyzing risk at the macro (or country environment), market, and project levels. Zeng et al. (2007) proposed a risk assessment model based on a modified analytical hierarchy process (AHP) and fuzzy reasoning to deal with the uncertainties arising in construction projects. Mojtahedi et al. (2008) presented a group decision making approach for identifying and analyzing project risks concurrently. They showed that project risk identification and analysis can be evaluated at the same time. Moreover, they applied the proposed approach in one mega project and rewarding results were obtained. Ebrahimnejad et al. (2008) introduced effective criteria and attributes for evaluating risks in construction projects, and presented a risk evaluation model based on fuzzy MADM. Makui et al. (2010) presented a new methodology for identifying and analyzing risks of mega projects (oil and gas industry) concurrently by applying a fuzzy multi-attribute group decision making (FMAGDM) approach. Risk identification and classification is the first step of the project risk management process, in which potential risks associated with an EPC project are identified. Numerous techniques exist for risk identification, such as brainstorming and workshops, checklists and prompt lists, questionnaires and interviews, Delphi groups or NGT, and various diagramming approaches such as cause-effect diagrams, systems dynamics, and influence diagrams (Chapman, 1998; Ebrahimnejad et al., 2008, 2010; Mojtahedi et al., 2009, 2010). There is no single ‘‘best method’’ for risk identification, and an appropriate combination of techniques should be used. As a result, it may be helpful to employ additional approaches to risk identification that were introduced as broader techniques in the group decision making field (Hashemi et al., 2011; Makui et al., 2010; Mousavi et al., 2011; Tavakkoli-Moghaddam et al., 2009).

There is increasing agreement that many decisions relating to long-term investments need to take climate change into account. But doing so is not easy, for at least two reasons. First, due to the rate of climate change, new infrastructure will have to be able to cope with a large range of changing climate conditions, which will make design more difficult and construction more expensive. Second, the uncertainty in future climate makes it impossible to directly use the output of a single climate model as an input for infrastructure design, and there are good reasons to think that the required climate information will not be available soon. Therefore, instead of optimizing based on the climate conditions projected by models, future infrastructure should be made more robust to possible changes in climate conditions. This aim implies that users of climate information must also change their practices and decision making frameworks, for instance by adapting the uncertainty management methods they currently apply to exchange rates or R&D outcomes.

Water resource management is one of the most important fields and has attracted a lot of attention. Qin et al. (2008) developed an integrated expert system for assessing climate change impacts on water resources and facilitating adaptation. The presented expert system could be used for both acquiring knowledge of climate change impacts on water resources and supporting formulation of the relevant adaptation policies. It can also be applied to other watersheds to facilitate assessment of climate change impacts on socio-economic and environmental sectors, as well as formulation of relevant adaptation policies. Yin (2001) developed an integrated approach based on the AHP for evaluating adaptation options to reduce climate change effects on water resources facilities.

There are many studies of climate change impacts and the relevant policy responses. For instance, Yin & Cohen (1994) developed a goal programming approach to evaluate climate change impacts and to identify regional policy responses. Huang et al. (1998) proposed a multi-objective programming method for land-resources adaptation planning under changing climate. Smith (1997) proposed an approach for identifying policy areas where adaptations to climate change should be considered. Lewsey et al. (2004) provided general recommendations and identified challenges for the incorporation of climate change impacts and risk assessment into long-term land-use national development plans and strategies. They addressed trends in land-use planning and, in the context of climate change, their impact on the coastal ecosystems of the Eastern Caribbean small islands. They set out broad policy recommendations that can help minimize the harmful impacts of these trends. Teegavarapu (2010) developed a soft-computing approach and fuzzy set theory for handling the preferences attached by the decision makers to magnitude and direction of climate change in water resources management models. A case study of a multi-purpose reservoir operation is used to address above issues within an optimization framework.

The review of the literature indicates that risk and uncertainty associated with climate change in construction projects in developing countries, particularly in Iran, have not received sufficient attention from researchers. In addition, climate change risk assessment in construction projects has been conducted within a framework of parametric statistics. Among the techniques used in these studies, such as multi-criteria decision making or mathematical modeling, most researchers have assumed that the parameters for assessing risks are known and that sufficient sample data are available. Moreover, parametric statistics, in which the population is assumed to follow a particular, typically normal, distribution, were used. However, in risk assessment of construction projects, particularly in developing countries such as Iran, this assumption cannot be made, either because of a shortage of professional experts or because of time constraints. Hence, large-sample techniques are often not practical in such projects. A non-parametric cross-validation resampling approach is therefore proposed for assessing risks associated with climate change in construction projects. This approach is flexible, easy to implement, and applicable in non-parametric settings.

This paper assumes that the risk data distributions in the construction projects are unknown. We cannot find enough professional experts to gather adequate data, and questioning experts about project risk to gather data is a time-consuming and non-economical process. Moreover, few experts are interested in answering or filling out questionnaires. Hence, this paper presents a non-parametric resampling approach based on cross-validation technique to overcome the lack of efficiency of existing techniques and to apply small data sets for risk assessment in the construction projects.

Theoretical studies and discussions of the cross-validation technique under various situations can be found in Stone (1974, 1977) and Efron (1983). The cross-validation predictive density dates at least to Geisser and Eddy (1979). Shao (1993) proved with asymptotic results and simulations that the model with the minimum value for the LOOCV estimate of prediction error is often over-specified. Sugiyama et al. (2007) proposed a technique called importance weighted cross validation. They proved that it is almost unbiased even under covariate shift, which guarantees the quality of the technique as a risk estimator. Hubert & Engelen (2007) constructed fast algorithms to perform cross-validation on high-breakdown estimators for robust covariance estimation and principal components analysis. The basic idea behind the LOOCV estimator lies in systematically recomputing the statistic estimate, leaving out one observation at a time from the sample set. From this new set of values of the statistic, an estimate of its bias and SD can be calculated. A non-parametric LOOCV technique provides several advantages over the traditional parametric approach. It is easy to describe and to apply to arbitrarily complicated situations. Furthermore, distribution assumptions, such as normality, are never made (Efron, 1983). Cross-validation has been used to solve many problems that are too complicated for traditional statistical analysis. There are numerous applications of the LOOCV in various fields (Bjorck et al., 2010; Efron & Tibshirani, 1993).

3. Proposed approach for construction projects

The objectives of this section are as follows: (1) establish a project risk management team, (2) identify and classify potential risks associated with climate changes in construction projects in Iran, (3) present a statistical approach for analyzing the impact of risks using a non-parametric LOOCV technique, and (4) test the validity of the proposed approach.

We implement the proposed approach in the risk assessment of a real-life construction project in the oil and gas industry in Iran. The project is subject to numerous sources of risk. Designing, constructing, operating, and maintaining the project is a complex, large-scale activity that both affects and is driven by many elements (e.g., local, regional, political entities, power brokers, and stakeholders). We aim to assess the climate change risks so that they can be understood clearly and managed effectively. There are many commonly used techniques for project risk identification and assessment (Chapman & Ward, 2004; Cooper et al., 2005). These techniques generate a list of risks that often does not directly help top managers know where to focus risk management attention. The analysis can help us to prioritize identified risks by estimating common criteria, exposing the most significant risks. Hence, this paper introduces a case study that assesses climate change risks in a non-parametric statistical environment.

Data sets on construction project risks are often small and limited. In addition, there are no parametric distributions on which significance can be estimated for risk data. The LOOCV, by contrast, is a powerful tool for assessing the accuracy of a parameter estimator in situations where traditional techniques are not valid. Moreover, the LOOCV technique is computationally inexpensive when the sample size is not large (Efron, 1983). A major application of this approach is the determination of bias. It answers questions such as: what is the bias of a mean, a median, or a quantile? This technique requires a minimal set of assumptions.
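As a rough illustration of how leaving out one observation at a time yields bias and SD estimates for a statistic, the following Python sketch uses the standard jackknife formulas. This is an assumption about the estimator form, since the chapter does not spell the formulas out here; the function names and the sample scores are hypothetical and serve only as an example.

```python
import numpy as np

def leave_one_out_estimates(sample, statistic):
    """Recompute `statistic` n times, each time leaving one observation out."""
    sample = np.asarray(sample, dtype=float)
    return np.array([statistic(np.delete(sample, i)) for i in range(sample.size)])

def loo_bias_and_sd(sample, statistic):
    """Jackknife-style leave-one-out estimates of the bias and SD of a statistic."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    theta_full = statistic(sample)                   # statistic on the full sample
    theta_loo = leave_one_out_estimates(sample, statistic)
    theta_bar = theta_loo.mean()                     # average leave-one-out value
    bias = (n - 1) * (theta_bar - theta_full)
    sd = np.sqrt((n - 1) / n * np.sum((theta_loo - theta_bar) ** 2))
    return bias, sd

# Hypothetical scores from ten experts (illustrative values only, not case-study data).
scores = [0.98, 0.94, 0.94, 0.94, 0.76, 0.84, 0.94, 0.84, 0.98, 0.98]
bias_mean, sd_mean = loo_bias_and_sd(scores, np.mean)
bias_median, sd_median = loo_bias_and_sd(scores, np.median)
```

For the mean, the bias estimate is zero up to rounding and the SD estimate coincides with the usual standard error, which makes the mean a convenient sanity check before applying the same routine to a median or a quantile.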

In light of the above-mentioned issues, this section proposes a practical approach for assessing risks in construction projects in three phases. The first phase, called phase zero, establishes a project risk management team. In this phase, the organizational and project environment in which risk management takes place is investigated. After forming the project risk management team, we construct the core of the proposed approach in the next two phases. Phase one falls into two steps. In the first step, risk data of construction projects are reviewed in order to identify the risks. In the second step, the risk breakdown structure (RBS) is developed in order to organize the different categories of project risks. Phase two of the proposed approach falls into four steps: (1) determine descriptive scales for converting linguistic variables of the probability and impact criteria to quantitative equivalents, (2) filter the risks at the lowest level of the RBS, regarded as initial risks, (3) classify the identified climate change risks (initial risks) into significant and insignificant risks, and (4) apply the non-parametric LOOCV technique for final ranking. This phase, in which risk assessment is carried out, attempts to understand potential project problems after the mega project risks have been identified. The proposed mechanism for construction projects is depicted in Fig. 1.

Figure 1.

Proposed non-parametric statistical approach for risk assessment in construction projects.

3.1. Principles of the LOOCV

  1. In the first step, principles of non-parametric cross-validation technique are described in order to resample project risks data from original observed risks data.

  2. In the second step, the cross-validation principle for estimating the SD of risk factors (RFs) is demonstrated in order to compare cross-validation resampled risk data with original observed project risks data.

Based on the first step of proposed approach, the cross-validation technique is a tool for uncertainty analysis based on resampling of experimentally observed data. Application of the cross-validation is justified by the so-called ‘‘plug-in principle’’, which means to take statistical properties of experimental results (=sample) as representative for the parent population. The main advantage of the cross-validation is that it is completely automatic. It is described best by setting two ‘‘Worlds’’, a ‘‘Real World’’ where the data is obtained and a ‘‘Cross-validation World’’ where statistical inference is performed, as shown in Fig. 2. The cross-validation partitions the data into two disjoint sets. The technique is fit with one set (the training set), which is subsequently used to predict the responses for the observations in the second set (assessment set).

In cross-validation, an intuitively appealing way to calculate a predicted response value is to use the parameter estimates from the fit obtained with the entire data set except for the observation to be predicted. The predicted value of $y_i$ obtained in this way is denoted by $\hat{y}_i$ ($i = 1, 2, \ldots, n$). The LOOCV estimate of average prediction error is then computed using this predicted response value as:

$$\hat{\Delta}_{CV,1} = n^{-1}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2. \tag{1}$$

Figure 2.

Schematic diagram of the cross-validation technique.

Generally, in K-fold cross-validation, approximately $n/K$ observations are omitted from the training set at a time. To predict the response values for the $k$th assessment set, $S_{k,a}$, all observations apart from those in $S_{k,a}$ form the training set, $S_{k,t}$, which is used to estimate the model parameters. The K-fold cross-validation average prediction error is computed as:

$$\hat{\Delta}_{CV,K} = n^{-1}\sum_{i=1}^{n}\left(y_i - \hat{y}_{(k,t)}\right)^2, \tag{2}$$

where $\hat{y}_{(k,t)}$ is the $i$th predicted response from $S_{k,a}$ (Wisnowski et al., 2003).

K-fold cross-validation: The algorithm in detail is as follows:

  • Split the dataset $D_N$ into $K$ roughly equal-sized parts.

  • For the $k$th part, $k = 1, \ldots, K$, fit the model to the other $K-1$ parts of the data, and calculate the prediction error of the fitted model when predicting the $k$th part of the data.

  • Do the above for $k = 1, \ldots, K$ and combine the $K$ estimates of prediction error.

Let $k(i)$ be the part of $D_N$ containing the $i$th sample. Then the cross-validation estimate of the MSE prediction error is:

$$\mathrm{MSE}_{CV} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i^{-k(i)}\right)^2, \tag{3}$$

where $\hat{y}_i^{-k(i)}$ denotes the fitted value for the $i$th observation returned by the model estimated with the $k(i)$th part of the data removed.
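The K-fold procedure above can be sketched in Python as follows. This is a minimal illustration, assuming an ordinary least-squares linear model and synthetic data; the function name `kfold_mse`, the random fold assignment, and the data-generating law are all hypothetical choices rather than part of the chapter's method.

```python
import numpy as np

def kfold_mse(X, y, K, seed=0):
    """K-fold cross-validation estimate of the MSE of a least-squares linear model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    N = len(y)
    rng = np.random.default_rng(seed)
    # Split the N sample indices into K roughly equal-sized parts.
    folds = np.array_split(rng.permutation(N), K)
    squared_errors = np.empty(N)
    for held_out in folds:
        train = np.setdiff1d(np.arange(N), held_out)
        # Fit the model on the other K-1 parts.
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        # Predict the held-out part with the model that never saw it.
        squared_errors[held_out] = (y[held_out] - X[held_out] @ beta) ** 2
    # Combine the per-observation errors into a single MSE estimate.
    return squared_errors.mean()

# Synthetic data (an assumption for illustration): y = 1 + 2x + noise.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 1.0, size=30)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, size=30)

mse_5fold = kfold_mse(X, y, K=5)
mse_loo = kfold_mse(X, y, K=len(y))  # K = N leaves one observation out at a time
```

Setting `K` equal to the sample size turns the same routine into the leave-one-out case, at the cost of fitting the model once per observation.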

Leave-one-out cross-validation (LOOCV): The cross-validation technique with $K = N$ is also called the leave-one-out algorithm. This means that for each $i$th sample, $i = 1, \ldots, N$:

  • Carry out the parametric identification, leaving that observation out of the training set.

  • Compute the predicted value for the $i$th observation, denoted by $\hat{y}_i^{-i}$.

The corresponding estimate of the mean squared error (MSE) is:

$$\mathrm{MSE}_{loo} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i^{-i}\right)^2. \tag{4}$$

The LOOCV often works well for estimating generalization error for continuous error functions such as the mean squared error, but it may perform poorly for discontinuous error functions such as the number of misclassified cases.
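To make the leave-one-out estimate concrete for a continuous error function, the sketch below computes it with an explicit loop for a least-squares linear model and compares it with the in-sample MSE. The linear model, the synthetic data, and the function names are assumptions made for illustration only.

```python
import numpy as np

def loocv_mse(X, y):
    """Leave-one-out estimate of the MSE of a least-squares linear model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    N = len(y)
    residuals = np.empty(N)
    for i in range(N):
        keep = np.arange(N) != i            # leave observation i out
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        residuals[i] = y[i] - X[i] @ beta   # y_i minus its leave-one-out prediction
    return np.mean(residuals ** 2)

def empirical_mse(X, y):
    """In-sample MSE of the model fitted on all N observations."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ beta) ** 2)

# Synthetic data (an assumption for illustration): y = 0.5 + 1.5x + noise.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=15)
X = np.column_stack([np.ones_like(x), x])
y = 0.5 + 1.5 * x + rng.normal(0.0, 0.2, size=15)

mse_loo = loocv_mse(X, y)
mse_emp = empirical_mse(X, y)
```

For least squares the leave-one-out MSE is never smaller than the in-sample MSE, because each prediction is made by a model that has not seen the observation it predicts.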

3.2. The linear case: mean integrated squared error

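Let us now compute the expected prediction error of a linear model trained on $D_N$ when it is used to predict, for the same training inputs $X$, a set of outputs $y_{ts}$ distributed according to the same linear law but independent of the training output $y$. We call this quantity the mean integrated squared error (MISE):

$$\begin{aligned}
\mathrm{MISE} &= E_{D_N, y_{ts}}\left[(y_{ts} - X\hat{\beta})^T (y_{ts} - X\hat{\beta})\right] \\
&= E_{D_N, y_{ts}}\left[(y_{ts} - X\beta + X\beta - X\hat{\beta})^T (y_{ts} - X\beta + X\beta - X\hat{\beta})\right] \\
&= N\sigma_w^2 + E_{D_N}\left[(X\beta - X\hat{\beta})^T (X\beta - X\hat{\beta})\right].
\end{aligned} \tag{5}$$

Since

$$X\beta - X\hat{\beta} = X\beta - X(X^TX)^{-1}X^T y = X\beta - X(X^TX)^{-1}X^T(X\beta + w) = -X(X^TX)^{-1}X^T w, \tag{6}$$

we have

$$\begin{aligned}
N\sigma_w^2 + E_{D_N}\left[(X\beta - X\hat{\beta})^T (X\beta - X\hat{\beta})\right] &= N\sigma_w^2 + E_{D_N}\left[w^T X(X^TX)^{-1}X^T X(X^TX)^{-1}X^T w\right] \\
&= N\sigma_w^2 + E_{D_N}\left[w^T X(X^TX)^{-1}X^T w\right] \\
&= N\sigma_w^2 + \sigma_w^2\,\mathrm{tr}\left(X(X^TX)^{-1}X^T\right) = \sigma_w^2 (N + p),
\end{aligned} \tag{7}$$

where $w$ is the noise term with variance $\sigma_w^2$ and $p$ is the number of model parameters (the number of columns of $X$), which equals the trace of the projection matrix $X(X^TX)^{-1}X^T$.

It follows that the residual sum of squares $SSE_{emp}$ returns a biased estimate of MISE, that is

$$E_{D_N}\left[\widehat{SSE}_{emp}\right] = E_{D_N}\left[e^T e\right] \neq \mathrm{MISE}. \tag{8}$$

Indeed, $E_{D_N}[e^T e] = \sigma_w^2 (N - p)$, which falls short of MISE by $2\sigma_w^2 p$. An unbiased estimate is obtained by replacing the residual sum of squares with

$$e^T e + 2\sigma_w^2 p. \tag{9}$$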
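The closed-form result can be checked numerically. The following Monte Carlo sketch assumes Gaussian noise and uses hypothetical values for $N$, $p$, $\sigma_w$ and $\beta$; it is meant only as a sanity check that the residual sum of squares underestimates MISE and that adding $2\sigma_w^2 p$ closes the gap.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, sigma_w = 20, 3, 0.5              # hypothetical sample size, parameters, noise SD
X = rng.normal(size=(N, p))             # fixed training inputs
beta = np.array([1.0, -2.0, 0.5])       # hypothetical true parameters
H = X @ np.linalg.inv(X.T @ X) @ X.T    # projection (hat) matrix, trace(H) = p

n_trials = 20000
# Each row is one independent realisation of the training / test outputs.
Y = X @ beta + rng.normal(0.0, sigma_w, size=(n_trials, N))
Y_ts = X @ beta + rng.normal(0.0, sigma_w, size=(n_trials, N))
Y_hat = Y @ H                           # H is symmetric, so each row equals H @ y

sse_emp = np.sum((Y - Y_hat) ** 2, axis=1)    # e^T e for each trial
ise = np.sum((Y_ts - Y_hat) ** 2, axis=1)     # squared error on independent outputs

mise_mc = ise.mean()                          # Monte Carlo estimate of MISE
mise_theory = sigma_w**2 * (N + p)            # sigma_w^2 (N + p)
sse_theory = sigma_w**2 * (N - p)             # sigma_w^2 (N - p)
corrected = sse_emp.mean() + 2 * sigma_w**2 * p   # e^T e + 2 sigma_w^2 p
```

With these settings `mise_mc` and `corrected` should both land close to `mise_theory`, while the average of `sse_emp` stays near the smaller value `sse_theory`.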

4. Case study (onshore gas refinery plant)

In this section, the proposed approach based on the non-parametric cross-validation technique is applied to the construction phase of an onshore gas refinery plant in Iran. The purpose of this case study is to assess the important climate change risks for the onshore gas refinery project.

Onshore gas refinery plants or fractionators are used to purify the raw natural gas extracted from underground gas fields and brought up to the surface by gas wells. The processed natural gas, used as fuel by residential, commercial and industrial consumers, is almost pure methane and is very much different from the raw natural gas.

The South Pars gas field is one of the largest independent gas reservoirs in the world, situated within the territorial waters between Iran and the state of Qatar in the Persian Gulf. It is one of the country’s main energy resources. South Pars gas field development is intended to meet the growing demand for natural gas for industrial and domestic utilization, injection into oil fields, gas and condensate export, and feedstock for refineries and the petrochemical industries (POGC, 2010).

This study has been carried out on the 18 phases of the South Pars gas field development in Iran. The location of the onshore refinery plant within the main WBS of the South Pars Gas Field Development (SPGFD) is illustrated in Fig. 3. The objectives of developing this refinery plant are as follows:

  • Daily production of 50 MMSCFD (million standard cubic feet per day) of natural gas

  • Daily production of 80,000 bbl of gas condensate

  • Annual production of 1 million tons of ethane

  • Annual production of 1.05 million tons of liquid gas, butane and propane

  • Daily production of 400 tons of sulphur

Figure 3.

Location of the onshore refinery plant in South Pars Gas Field Development

The contract type of the above-mentioned project is MEPCC, which includes management, engineering, procurement, construction and commissioning. In an MEPCC contract, the MEPCC contractor agrees to deliver the keys of a commissioned plant to the owner for an agreed period of time. The MEPCC way of executing a project is gaining importance worldwide, but profitable contract execution requires a thorough understanding on the part of the MEPCC contractor, especially in a global context. The contractor must be informed of the various factors that affect the process of work and the success or failure of the contract in the global arena, and must have data and expertise in all the required fields.

In this paper, climate change risks are considered from the general contractor’s (GC) perspective. The GC receives work packages from the owner and delivers them to subcontractors by bidding and contracting. This contractor is in charge of monitoring the planning, engineering, designing, and constructing phases. Moreover, the installation, leadership, and payment of the subcontractors are borne by the GC. The climate change risks listed in Table 1 were identified by gathering historical information from the construction phase of gas refinery projects in Iran.

| Risk | Description |
|---|---|
| 1 | Sea level rise |
| 2 | Flood |
| 3 | Earthquake |
| 4 | Bush fire |
| 5 | Tsunami |
| 6 | Sand storm |
| 7 | Increased atmospheric CO2 |
| 8 | Precipitation patterns & amount |
| 9 | Increased temperature |
| 10 | Hurricane |

Table 1.

Climate change risk description

4.1. Apply the proposed approach to assess the risks of climate changes

In this sub-section, we show how the proposed approach can be used in a risk assessment given the lack of risk sample data and the periodic features of construction projects. Comparing the mean and the SD of the original sample distribution with those of the cross-validation resampled distribution can produce a better result.

In a risk analysis, we consider two indexes: probability and impact. The probability of a risk is a number between 0 and 1, whereas the impact of a risk is qualitative and must therefore be converted to a quantitative number, likewise between 0 and 1. The two indexes are defined as follows:

  • Probability criterion: Risk probability assessment investigates likelihood that each specific risk will occur.

  • Impact criterion: Risk impact assessment investigates potential effect on a project objective such as time, cost, scope, or quality.

The RF is computed as follows (Chapman & Ward, 2004; Chapman, 2001):

$$RF_{ij} = P_{ij} + I_{ij} - (P_{ij} \times I_{ij}) \tag{10}$$

The RF, from 0 (low) to 1 (high), reflects the likelihood of a risk arising and the severity of its impact. The risk factor will be high if the likelihood P is high, or the consequence I is high, or both. Note that the formula only works if P and I are on scales from 0 to 1. Mathematically it derives from the probability calculation for disjunctive events:

$$\mathrm{Prob}(A \text{ or } B) = \mathrm{Prob}(A) + \mathrm{Prob}(B) - \mathrm{Prob}(A)\,\mathrm{Prob}(B) \tag{11}$$

Two events are said to be independent if the occurrence or nonoccurrence of either one in no way affects the occurrence of the other. It follows that if events A and B are independent, then Prob(A and B) = Prob(A)*Prob(B). Two events are said to be mutually exclusive if the occurrence of either one precludes the occurrence of the other; in that case Prob(A and B) = 0.

Provided that the probability and impact of project risks are independent, the formula functions properly in risk analysis; it is merely a useful piece of arithmetic for setting risk rankings and priorities. Ten different risks have been identified, and for each of them we consider ten probabilities and ten impacts, which form our sample. That is, in Eq. (10), $P_{ij}$ is the probability of the ith risk in the jth observation and $I_{ij}$ is the impact of the ith risk in the jth observation. It is worth mentioning that experts are asked to estimate the probability and impact of each risk on a scale from very low (VL) to very high (VH) based on Table 2 (Chapman, 2001); their estimates are gathered in Table 3. The gathered data (linguistic variables) are then converted to numerical values, and the results are shown in Table 4.

| Scale | Probability | Impact: Time | Impact: Cost | Impact: Performance |
|---|---|---|---|---|
| Very Low (VL) | < 10% | < 1 week | < 0.1 M USD | Failure to meet specification clause |
| Low (L) | 10-30% | 1-5 weeks | 0.1-0.5 M USD | Failure to meet specification clauses |
| Medium (M) | 31-50% | 5-10 weeks | 0.5-5 M USD | Minor shortfall in brief |
| High (H) | 51-70% | 10-15 weeks | 5-20 M USD | Major shortfall in satisfaction of the brief |
| Very High (VH) | > 70% | > 15 weeks | > 20 M USD | Project does not satisfy business objectives |

Table 2.

Measures of probability and impact (M USD: Million US Dollar).

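Decision makers (DMs) 1-5:

| Risk | DM1 P | DM1 I | DM2 P | DM2 I | DM3 P | DM3 I | DM4 P | DM4 I | DM5 P | DM5 I |
|---|---|---|---|---|---|---|---|---|---|---|
| R1 | VH | VH | H | VH | VH | H | H | VH | M | H |
| R2 | H | M | H | H | M | H | H | VH | M | H |
| R3 | VH | H | H | M | H | H | M | H | H | M |
| R4 | L | M | L | H | M | L | L | H | L | M |
| R5 | M | M | L | H | L | M | M | H | L | L |
| R6 | H | H | H | M | M | VH | H | H | M | H |
| R7 | H | VH | H | H | VH | VH | VH | VH | VH | H |
| R8 | VH | H | VH | VH | M | H | VH | M | H | M |
| R9 | M | H | L | H | H | H | M | M | H | VH |
| R10 | H | H | VH | H | M | H | H | H | H | H |

Decision makers (DMs) 6-10:

| Risk | DM6 P | DM6 I | DM7 P | DM7 I | DM8 P | DM8 I | DM9 P | DM9 I | DM10 P | DM10 I |
|---|---|---|---|---|---|---|---|---|---|---|
| R1 | H | H | VH | H | H | H | VH | VH | VH | VH |
| R2 | M | H | VH | M | H | H | M | H | VH | H |
| R3 | M | M | H | H | M | H | M | H | H | M |
| R4 | L | H | H | L | L | M | M | M | M | L |
| R5 | M | L | M | L | M | M | L | L | H | M |
| R6 | H | M | H | VH | M | H | H | H | H | H |
| R7 | H | H | H | VH | M | H | VH | H | VH | VH |
| R8 | VH | M | H | H | VH | H | M | VH | H | H |
| R9 | M | H | H | M | M | H | H | H | M | H |
| R10 | M | M | M | H | VH | H | M | VH | H | H |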

Table 3.

Risk observed data presented by linguistic variables.

RiskDMs
12345
PIPIPIPIPI
R10.850.850.600.850.850.600.600.850.400.60
R20.600.400.600.600.400.600.600.850.400.60
R30.850.600.600.400.600.600.400.600.600.40
R40.200.400.200.600.400.200.200.600.200.40
R50.400.400.200.600.200.400.400.600.200.20
R60.600.600.600.400.400.850.600.600.400.60
R70.600.850.600.600.850.850.850.850.850.60
R80.850.600.850.850.400.600.850.400.600.40
R90.400.600.200.600.600.600.400.400.600.85
R100.600.600.850.600.400.600.600.600.600.60
RiskDMs
678910
PIPIPIPIPI
R10.600.600.850.600.600.600.850.850.850.85
R20.400.600.850.400.600.600.400.600.850.60
R30.400.400.600.600.400.600.400.600.600.40
R40.200.600.600.200.200.400.400.400.400.20
R50.400.200.400.200.400.400.200.200.600.40
R60.600.400.600.850.400.600.600.600.600.60
R70.600.600.600.850.400.600.850.600.850.85
R80.850.400.600.600.850.600.400.850.600.60
R90.400.600.600.400.400.600.600.600.400.60
R100.400.400.400.600.850.600.400.850.600.60

Table 4.

Converted risk observed data.
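The conversion from Table 3 to Table 4 can be sketched as a simple lookup. The numeric scores below are inferred by comparing the two tables entry by entry and are therefore an assumption about the conversion the authors used; VL does not occur in the observed data, so no score is assigned to it here. The names `SCALE_TO_SCORE` and `convert` are hypothetical.

```python
# Assumed mapping, inferred by comparing Table 3 with Table 4.
# "VL" never occurs in the observed data, so it is deliberately left out.
SCALE_TO_SCORE = {"L": 0.20, "M": 0.40, "H": 0.60, "VH": 0.85}


def convert(ratings):
    """Convert a list of linguistic ratings to numeric scores."""
    return [SCALE_TO_SCORE[r] for r in ratings]


# Risk 1, decision makers 1-5 (Table 3), alternating P and I
r1_linguistic = ["VH", "VH", "H", "VH", "VH", "H", "H", "VH", "M", "H"]
r1_numeric = convert(r1_linguistic)
print(r1_numeric)
```

For risk 1 this yields the first row of the first block of Table 4.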

A sampling distribution is based on many random samples from the population. In place of many samples from the population, we create many resamples by repeatedly sampling with replacement from this one random sample. The sampling distribution of a statistic collects the values of the statistic from many samples; the cross-validation distribution of a statistic collects its values from many resamples and thereby gives information about the sampling distribution. A set of n values is randomly sampled from the population. The sample estimate of RF is based on the 10 values $(P_1, P_2, \ldots, P_{10})$ and $(I_1, I_2, \ldots, I_{10})$. Sampling 10 values with replacement from the sets $(P_1, P_2, \ldots, P_{10})$ and $(I_1, I_2, \ldots, I_{10})$ provides a LOOCV sample $(P_1^*, P_2^*, \ldots, P_{10}^*)$ and $(I_1^*, I_2^*, \ldots, I_{10}^*)$. Observe that not all values may appear in the cross-validation sample. The LOOCV sample estimate $RF^*$ is based on the 10 cross-validation values $(P_1^*, P_2^*, \ldots, P_{10}^*)$ and $(I_1^*, I_2^*, \ldots, I_{10}^*)$. The sampling of $(P_1, P_2, \ldots, P_{10})$ and $(I_1, I_2, \ldots, I_{10})$ with replacement is repeated many times (say n times), each time producing a LOOCV estimate $RF^*$.

Call the means of these resamples $\overline{RF}^*$ to distinguish them from the mean $\overline{RF}$ of the original sample. Find the mean and SD of the $\overline{RF}^*$ in the usual way. To make clear that these are the mean and SD of the means of the cross-validation resamples rather than the mean $\overline{RF}$ and standard deviation of the original sample, we use a distinct notation:

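$$\text{mean}_{LOOCV} = \frac{1}{n} \sum \overline{RF}^{*} \tag{12}$$

$$SD_{LOOCV} = \sqrt{\frac{1}{n-1} \sum \left(\overline{RF}^{*} - \text{mean}_{LOOCV}\right)^{2}} \tag{13}$$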

Because a sample consists of only a few observations, which is in the nature of construction projects, we use the LOOCV technique to improve the accuracy of the calculated mean and SD of the RF for the risks that may occur in a project.
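The resampling procedure and Eqs. (12) and (13) can be sketched in Python as follows. This is an illustration of the sampling-with-replacement description given in the text, not a reproduction of the Excel add-in the authors used. Drawing P and I together as one pair per decision maker, the number of replications (1000), and the fixed seed are all assumptions; the data are the converted scores of risk 1 from Table 4.

```python
import random
from statistics import mean, stdev

# Converted scores for risk 1 from Table 4 (decision makers 1-10)
p = [0.85, 0.60, 0.85, 0.60, 0.40, 0.60, 0.85, 0.60, 0.85, 0.85]
i = [0.85, 0.85, 0.60, 0.85, 0.60, 0.60, 0.60, 0.60, 0.85, 0.85]


def resample_means(p, i, replications=1000, seed=0):
    """Mean RF of each resample drawn with replacement from the (P, I) pairs."""
    rng = random.Random(seed)  # fixed seed (assumed) for repeatable output
    n = len(p)
    means = []
    for _ in range(replications):
        idx = [rng.randrange(n) for _ in range(n)]      # n draws with replacement
        rf = [p[k] + i[k] - p[k] * i[k] for k in idx]   # Eq. (10) per drawn pair
        means.append(mean(rf))
    return means


means = resample_means(p, i)
mean_resample = mean(means)  # Eq. (12)
sd_resample = stdev(means)   # Eq. (13), divisor (replications - 1)
print(f"mean={mean_resample:.3f}, SD={sd_resample:.3f}")
```

The SD computed here describes the spread of the resample means, so it is expected to be noticeably smaller than the SD of the ten original RF values.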

4.2. Results

To carry out the resampling replications, we used the resampling Stat Add-in of Excel. We compare the original sample with the LOOCV resample produced by the Excel Add-in to see what difference it makes. Table 5 presents the statistical data of the original sample.

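| Risk | P (mean) | I (mean) | RF (mean) | P (SD) | I (SD) | RF (SD) |
|---|---|---|---|---|---|---|
| R1 | 0.705 | 0.725 | 0.913 | 0.164 | 0.132 | 0.074 |
| R2 | 0.570 | 0.585 | 0.827 | 0.175 | 0.125 | 0.078 |
| R3 | 0.545 | 0.520 | 0.782 | 0.146 | 0.103 | 0.078 |
| R4 | 0.300 | 0.400 | 0.596 | 0.141 | 0.163 | 0.081 |
| R5 | 0.340 | 0.360 | 0.576 | 0.135 | 0.158 | 0.145 |
| R6 | 0.540 | 0.610 | 0.825 | 0.097 | 0.151 | 0.065 |
| R7 | 0.705 | 0.725 | 0.913 | 0.164 | 0.132 | 0.074 |
| R8 | 0.685 | 0.590 | 0.879 | 0.189 | 0.165 | 0.076 |
| R9 | 0.460 | 0.585 | 0.774 | 0.135 | 0.125 | 0.084 |
| R10 | 0.570 | 0.605 | 0.831 | 0.175 | 0.107 | 0.092 |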

Table 5.

Statistical data of the original sample.
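As a check on how the entries of Table 5 relate to Table 4, the short sketch below recomputes the statistics for risk 1 from its ten converted observations. Using the sample SD (divisor n - 1) and averaging the per-observation RF values are assumptions about how the table was produced.

```python
from statistics import mean, stdev

# Converted scores for risk 1 from Table 4 (decision makers 1-10)
p = [0.85, 0.60, 0.85, 0.60, 0.40, 0.60, 0.85, 0.60, 0.85, 0.85]
i = [0.85, 0.85, 0.60, 0.85, 0.60, 0.60, 0.60, 0.60, 0.85, 0.85]

# Eq. (10) applied to each (P, I) observation
rf = [pj + ij - pj * ij for pj, ij in zip(p, i)]

# stdev() is the sample SD with divisor n - 1 (assumed convention)
summary = {
    "P": (mean(p), stdev(p)),
    "I": (mean(i), stdev(i)),
    "RF": (mean(rf), stdev(rf)),
}
for name, (m, s) in summary.items():
    print(f"{name}: mean={m:.3f}, SD={s:.3f}")
```

Under these assumptions the values agree with the R1 row of Table 5 to three decimals.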

After the LOOCV resample replications, we obtain the mean and the SD of P, I and RF. The data are reported in Table 6.

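| Risk | P (mean) | I (mean) | RF (mean) | P (SD) | I (SD) | RF (SD) |
|---|---|---|---|---|---|---|
| R1 | 0.728 | 0.731 | 0.918 | 0.048 | 0.052 | 0.024 |
| R2 | 0.554 | 0.584 | 0.819 | 0.059 | 0.048 | 0.027 |
| R3 | 0.547 | 0.529 | 0.787 | 0.052 | 0.029 | 0.035 |
| R4 | 0.293 | 0.409 | 0.598 | 0.047 | 0.042 | 0.037 |
| R5 | 0.331 | 0.358 | 0.571 | 0.055 | 0.050 | 0.034 |
| R6 | 0.553 | 0.574 | 0.814 | 0.019 | 0.054 | 0.025 |
| R7 | 0.693 | 0.722 | 0.910 | 0.077 | 0.049 | 0.034 |
| R8 | 0.672 | 0.595 | 0.872 | 0.044 | 0.058 | 0.016 |
| R9 | 0.469 | 0.564 | 0.770 | 0.035 | 0.029 | 0.020 |
| R10 | 0.559 | 0.598 | 0.823 | 0.067 | 0.030 | 0.043 |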

Table 6.

Statistical data of the LOOCV resample.

Then the MSE is calculated for P, I and RF of all significant risks, based on the LOOCV principles. The results are shown in Table 7.

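| Risk | P MSE | I MSE | RF MSE_LOOCV |
|---|---|---|---|
| R1 | 0.042 | 0.031 | 0.009 |
| R2 | 0.056 | 0.025 | 0.010 |
| R3 | 0.035 | 0.017 | 0.009 |
| R4 | 0.033 | 0.042 | 0.014 |
| R5 | 0.037 | 0.045 | 0.036 |
| R6 | 0.015 | 0.049 | 0.009 |
| R7 | 0.058 | 0.034 | 0.011 |
| R8 | 0.060 | 0.054 | 0.010 |
| R9 | 0.031 | 0.023 | 0.011 |
| R10 | 0.059 | 0.023 | 0.019 |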

Table 7.

MSE calculation for risk data.

5. Discussion and test

In this section, the proposed approach is discussed and tested using the computational results from the construction of the gas refinery plant. The reduction of standard deviations, the MSE comparison and the normality plot are the main topics of discussion.

5.1. Reduction of standard deviations

As Fig. 4 shows, the SDs for the higher risks of the project are reduced remarkably, which indicates the efficiency of the proposed approach in project risk assessment. The results show that the proposed approach is practical and logical for estimating the SD, particularly in construction projects.

Figure 4.

Standard deviation comparison between original sample and LOOCV resample

A comparison of the SD of the original sample with that of the LOOCV resample, from the RF point of view, shows for instance that the SD of risk 1 is 0.074 in the original sample, whereas the SD of the same risk in the LOOCV resample is 0.024. In other words, the SD has been reduced by about 63% for this risk. These reductions indicate that the LOOCV yields a more accurate RF for each risk in construction projects. The SD reduction rate is computed by:

$$SD_{Red}\% = \frac{SD_{O} - SD_{LOOCV}}{SD_{O}} \times 100, \tag{14}$$

where $SD_{Red}$ denotes the rate of SD reduction through the LOOCV, $SD_{O}$ represents the SD of the original RF data sample and $SD_{LOOCV}$ indicates the SD of the LOOCV resample. The SD reduction rate is presented in Table 8 for each risk.
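A small sketch of Eq. (14) follows. The function name `sd_reduction_pct` and the guard against a non-positive original SD are illustrative assumptions; the inputs are the rounded P-column SDs of risk 1 taken from Tables 5 and 6.

```python
def sd_reduction_pct(sd_original: float, sd_resample: float) -> float:
    """SD reduction rate of Eq. (14), in percent."""
    if sd_original <= 0:
        raise ValueError("the original SD must be positive")
    return (sd_original - sd_resample) / sd_original * 100


# P column of risk 1: SD 0.164 (Table 5) and 0.048 (Table 6)
print(round(sd_reduction_pct(0.164, 0.048), 2))  # 70.73
```

The result is close to the 70.57 reported for this entry in Table 8; a small gap is to be expected, presumably because the inputs used here are the rounded three-decimal values rather than the unrounded ones.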

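| Risk | SD reduction % (P) | SD reduction % (I) | SD reduction % (RF) |
|---|---|---|---|
| R1 | 70.57 | 60.19 | 63.36 |
| R2 | 66.40 | 61.84 | 65.12 |
| R3 | 64.24 | 71.67 | 55.56 |
| R4 | 67.04 | 74.18 | 54.37 |
| R5 | 59.34 | 68.54 | 76.61 |
| R6 | 79.86 | 64.01 | 61.06 |
| R7 | 53.05 | 62.55 | 54.67 |
| R8 | 76.60 | 64.67 | 78.67 |
| R9 | 73.74 | 76.81 | 76.56 |
| R10 | 61.72 | 71.55 | 53.24 |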

Table 8.

Rate of SD reduction for each risk

5.2. MSE interpretation and comparison

RF_Traditional and RF_LOOCV are shown in Table 9; moreover, their absolute differences are calculated to illustrate that there is no significant difference between the two techniques. Therefore, we should take advantage of the MSE_LOOCV value for ranking project risks. For this purpose, we ranked the risks based on the MSE_LOOCV values in Table 7, where a smaller MSE_LOOCV means a higher rank and priority of the project risk; the results are shown in Tables 9 and 10. It is obvious that there is a significant difference between the traditional risk ranking technique and the ranking based on MSE_LOOCV. Based on the traditional risk ranking technique, risk 1 (sea level rise) has the first priority, but based on MSE_LOOCV, risk 3 (earthquake) has the first priority; this result is highly applicable to gas refinery construction projects in Iran.

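| Risk | RF_Traditional | RF_LOOCV | Abs(RF_Traditional - RF_LOOCV) |
|---|---|---|---|
| R1 | 0.913 | 0.918 | 0.005 |
| R2 | 0.827 | 0.819 | 0.008 |
| R3 | 0.782 | 0.787 | 0.005 |
| R4 | 0.596 | 0.598 | 0.002 |
| R5 | 0.576 | 0.571 | 0.005 |
| R6 | 0.825 | 0.814 | 0.011 |
| R7 | 0.913 | 0.910 | 0.003 |
| R8 | 0.879 | 0.872 | 0.007 |
| R9 | 0.774 | 0.770 | 0.004 |
| R10 | 0.831 | 0.823 | 0.008 |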

Table 9.

Comparison between traditional risk ranking and LOOCV ranking.

| Traditional risk ranking | Risk ranking based on MSE_LOOCV |
|---|---|
| R1 | R3 |
| R7 | R6 |
| R8 | R1 |
| R10 | R2 |
| R2 | R8 |
| R6 | R7 |
| R3 | R9 |
| R9 | R4 |
| R4 | R10 |
| R5 | R5 |

Table 10.

Ranking comparison between traditional risk ranking and LOOCV ranking.
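The two rankings can be sketched from the published values as follows: risks are ordered by decreasing RF for the traditional ranking and by increasing MSE_LOOCV for the proposed ranking. The variable names are hypothetical, and the inputs are the rounded values from Tables 5 and 7. Because several risks tie at three decimals (for example R1, R3 and R6 all show 0.009), this sketch breaks ties by table order, so the order inside a tied group may differ from Table 10, which was presumably built from unrounded values.

```python
# RF (mean) of the original sample, Table 5
rf_traditional = {
    "R1": 0.913, "R2": 0.827, "R3": 0.782, "R4": 0.596, "R5": 0.576,
    "R6": 0.825, "R7": 0.913, "R8": 0.879, "R9": 0.774, "R10": 0.831,
}
# RF MSE_LOOCV, Table 7
mse_loocv = {
    "R1": 0.009, "R2": 0.010, "R3": 0.009, "R4": 0.014, "R5": 0.036,
    "R6": 0.009, "R7": 0.011, "R8": 0.010, "R9": 0.011, "R10": 0.019,
}

# Traditional ranking: larger RF first. sorted() is stable, so ties keep table order.
traditional_rank = sorted(rf_traditional, key=lambda r: -rf_traditional[r])
# Proposed ranking: smaller MSE_LOOCV first.
mse_rank = sorted(mse_loocv, key=lambda r: mse_loocv[r])

for position, (trad, prop) in enumerate(zip(traditional_rank, mse_rank), start=1):
    print(position, trad, prop)
```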

In the following, we compare the original sample (i.e., traditional approach) and cross-validation resample (i.e., cross-validation approach) in two respects: normal probability plot (NPP) and matrix plot (MP).

5.3. Normal probability plot and matrix plot

Normal probability plot: The NPP is a graphical technique for normality testing, that is, for assessing whether or not a data set is approximately normally distributed. The data are plotted against a theoretical normal distribution in such a way that the points should form an approximately straight line; departures from this straight line indicate departures from normality. The NPP is a special case of the probability plot, for the case of a normal distribution. Two NPPs are illustrated for the RFs of the studied case in Fig. 5. Parts A and B of Fig. 5 are the NPPs for the initial and final RFs, respectively. When the NPP is basically straight, we have evidence that the data are sampled from a normal distribution. In part A, drawn before running the LOOCV, the points clearly do not follow a straight line, so the data are generally not normal. In part B, however, the data are distributed closer to the mean. Therefore, the degree of normality in part B (after running the cross-validation) is higher than in part A (before running the cross-validation); in other words, the data have a distribution that is not far from normal.

To apply the LOOCV idea, we should start with a statistic that estimates the parameter in which we are interested. We come up with a suitable statistic by appealing to another principle that we often apply without thinking about it. In this sub-section, the proposed approach clearly shows that the distribution of the original samples does not exactly follow the normal distribution, whereas the distribution of the LOOCV resamples does when the LOOCV is applied to the construction project risks (see Fig. 5). Comparing the original samples with the LOOCV resamples for the different risks, it is evident that the LOOCV resamples are closer to the normal distribution than the original samples of project risks.

Figure 5.

NPP for the RFs in the project.
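The construction of an NPP can be sketched numerically: the ordered data are paired with theoretical standard-normal quantiles, and the straightness of the resulting points can be summarized by their correlation. The plotting positions (k - 0.5)/n and the use of a correlation coefficient as a straightness measure are assumptions, not choices stated in the chapter, and the ten RF means of Table 5 serve only as sample input, since the exact data plotted in Fig. 5 are not specified.

```python
from statistics import NormalDist, mean

# RF (mean) of the original sample for the ten risks, Table 5
rf = [0.913, 0.827, 0.782, 0.596, 0.576, 0.825, 0.913, 0.879, 0.774, 0.831]


def npp_points(values):
    """Return (theoretical normal quantile, ordered value) pairs."""
    n = len(values)
    ordered = sorted(values)
    # Plotting positions (k - 0.5) / n are an assumed convention
    quantiles = [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(quantiles, ordered))


def straightness(points):
    """Pearson correlation of the NPP points; values near 1 suggest normality."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


points = npp_points(rf)
r = straightness(points)
print(f"NPP correlation: {r:.3f}")
```

Plotting the pairs in `points` gives the NPP itself; comparing the correlation before and after resampling is one simple way to put a number on the visual comparison between parts A and B.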

Matrix plot: An MP is a kind of scatter plot that enables the user to see the pairwise relationships between variables. Given a set of variables Var1, Var2, Var3, ..., the MP contains all the pairwise scatter plots of the variables on a single page in a matrix format. The matrix plot is a square matrix with the names of the variables on the diagonal and scatter plots everywhere else. That is, if there are k variables, the scatter plot matrix has k rows and k columns, and the cell in the ith row and jth column is a plot of Var_i versus Var_j. The axes and the values of the variables appear at the edge of the respective row or column. One can observe the behavior of the variables with one another at a glance. The comparison of the variables under study and their interaction with one another can be studied easily, as depicted in Fig. 6 for the construction project. This is why matrix plots are becoming increasingly common in general-purpose statistical software programs.

Figure 6.

MP for the RFs before and after running the LOOCV in the project

The researchers have shown that a resampling-based procedure (based on the cross-validation) can be easily applied to risk assessment in construction projects. In this paper, the routines implementing the described procedures were run in the Stat Add-in of Excel. Considering all the different aspects of the projects' characteristics, the proposed LOOCV approach is very useful for risk assessment in these projects because it provides the accurate calculation discussed in this section. To ensure the performance of the approach, experts in construction projects were asked to check the risk approach prepared with the non-parametric statistical technique for applicability, efficiency, and overall performance. They confirmed the results of the proposed approach in the real world of large-scale construction projects.

6. Conclusion

In this paper, we have attempted to introduce an effective framework based on the LOOCV technique to academia and to practitioners in real-life situations. The cross-validation technique has been used to solve many other engineering and management problems that would be complicated for a traditional statistical analysis. In simple words, the cross-validation does with the computer what we would do in practice if it were possible: repeat the experiment. Moreover, the LOOCV technique is extremely valuable in situations where data sizes are small and limited, which is often the case in applications of project risk assessment. In the proposed model, the basic principle of the LOOCV technique was explained for analyzing risks where a particular family of probability distributions is not specified and the original risk data sizes are small. In particular, we have explained the LOOCV principle for estimating the SD of RFs associated with climate change issues in the construction project. We have found that the LOOCV estimates the SD of RFs with greater accuracy than estimating the SD from the original risk data. The SDs of the RFs were remarkably reduced when the non-parametric LOOCV was applied. It has been found that the distribution of the original samples did not exactly follow the normal distribution, whereas the distribution of the LOOCV resamples did when the proposed approach was applied (normality checking). The NPP was then provided in order to compare the traditional ranking and the proposed ranking for the climate change risks. The related results demonstrated that the proposed approach could assist top managers to better assess the risks of climate change in the gas refinery plant construction in Iran. In future research, we may compare different non-parametric resampling techniques on climate change risk data of construction projects.

© 2011 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike-3.0 License, which permits use, distribution and reproduction for non-commercial purposes, provided the original is properly cited and derivative works building on this content are distributed under the same license.

How to cite and reference

S. Mohammad H. Mojtahedi and S.M. Mousavi (July 28th 2011). A New Non-Parametric Statistical Approach to Assess Risks Associated with Climate Change in Construction Projects Based on LOOCV Technique, Risk Management Trends, Giancarlo Nota, IntechOpen, DOI: 10.5772/21362. Available from:
