Risk Management Techniques

Risk management has become increasingly important for construction projects across many industries, and a risk management department should therefore be established to monitor risks. The construction industry and its managers are exposed to a high degree of risk, which leads to cost increases or project delays. Techniques are therefore needed to control risk and to determine the best way to respond to it. This work describes artificial intelligence and its techniques, including their principles, advantages, and types: classification techniques (decision tree, K-star, neural network, and support vector machine), simulation techniques such as system dynamics, and optimization techniques such as particle swarm optimization and the gravitational search algorithm.


Introduction
The concept of risk management has two parts. The first is management, which means planning, organizing, and protecting; the second is risk, which is the variability of what is expected [1].
Risk management is defined as the process of identifying risks, analyzing them with a suitable method, and then applying an appropriate response to eliminate or reduce them, thereby increasing the likelihood that the project succeeds and achieves its goals [2].
Risk management is also defined as the process that enables the analysis and management of project-related risks. Its aim is to reduce the risks that threaten the project's goals, and it is therefore responsible for increasing the chance that the project is completed on time, within cost, and to the required quality [3]. Many risk management techniques exist, and artificial intelligence techniques are especially important among them.
Artificial intelligence (AI) is defined as the study of systems that behave in a manner an observer would consider intelligent. AI uses tools based on the intelligent behavior of humans and other animals to solve complex problems [4].
AI is concerned with intelligent behavior in artifacts, which includes understanding, thinking, learning, communicating, and working in complex environments. In general, the ultimate objective of AI is to develop tools and mechanisms that can behave as humans do, or even better. Another objective is to understand such behavior, whether it appears in machines or in humans, that is, to simulate human behavior. AI therefore has both scientific and engineering objectives [5]. The scientific literature reports that the integration of artificial intelligence with project management covers areas such as project success estimation, identification of critical success factors, project budgeting, project schedule planning, and risk identification [6].
Artificial intelligence includes classification techniques.

Classification
Classification, as described in statistics and machine learning, is the problem of identifying to which of a set of categories (subpopulations) a new observation belongs, based on a training set of data containing instances whose classes are already known. Examples include assigning an email to the spam or non-spam class, or assigning a diagnosis to a patient based on known features of the patient such as gender, symptoms, and blood pressure. In other words, classification is an example of pattern recognition [7].
Classification is the process of learning a target function f that maps each attribute set x to one of the already known class labels y. The input is a collection of records (the training set); each record contains a set of attributes, one of which is the class [8].
A classification model can be used for the following.
I. Descriptive modeling: the classification model can serve as an explanatory tool to distinguish between objects of different classes.
II. Predictive modeling: the classification model can be used to predict the class label of unknown records [8], as shown in Figure 1.
Classification methods can be split into two categories, parametric and nonparametric. Parametric methods assume that the population is normally distributed and estimate the parameters of that distribution to solve the problem [9]. Nonparametric methods make no assumptions about the distribution and are therefore distribution-free [10].

Classification techniques
The classification techniques are as follows.

Decision tree
A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility [11].
Decision trees are commonly used in operations research, specifically in decision analysis, to help identify the strategy most likely to reach a goal; they are also a popular tool in machine learning [11].
Decision tree induction is the learning of decision trees from class-labeled training rows. A decision tree can be regarded as a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of that test, and each leaf node holds a class label [12].
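As a rough illustration of how induction picks the attribute tested at an internal node, the sketch below scores nominal attributes by information gain and selects the best one as the root test. It is a minimal sketch under stated assumptions, not the procedure of any particular tool: C4.5-style learners use the related gain ratio, and the attribute names, records, and function names here are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on the nominal attribute `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels):
    """Attribute whose test would be placed at the current internal node."""
    return max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))

# Hypothetical class-labeled training rows (not data from this study).
rows = [
    {"probability": "high", "impact": "high"},
    {"probability": "low", "impact": "high"},
    {"probability": "high", "impact": "low"},
    {"probability": "low", "impact": "low"},
]
labels = ["high", "high", "low", "low"]

root = best_attribute(rows, labels)
print(root)
```

In this toy data the class follows the impact attribute exactly, so splitting on it yields pure leaves, while splitting on probability yields no gain.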

K-star algorithm
K-star (K*) is an instance-based classifier: the class of a test instance is determined from the classes of the training instances similar to it, as measured by a similarity function. It differs from other instance-based learners in that it uses an entropy-based distance function. Instance-based learners classify an instance by comparing it with a database of previously classified examples, on the basic assumption that similar instances will have similar classifications. What "similar instance" and "similar classification" mean is defined by the two components of an instance-based learner: the distance function, which determines how similar two instances are, and the classification function, which specifies how instance similarities yield a final classification for the new instance. K-star uses an entropic measure based on the probability of transforming one instance into another by randomly choosing among all possible transformations. Using entropy as a measure of distance is helpful because the distance between instances can then be computed with the tools of information theory: the distance between two instances is the complexity of transforming one into the other. This is done in two steps. First, a finite set of transformations that map one instance to another is defined. Then, an instance a is transformed into another instance b by a program, that is, a finite sequence of transformations starting at a and ending at b. Let I be a (possibly infinite) set of instances and T a finite set of transformations on I, where each t in T maps instances to instances, t: I → I. T contains a distinguished member σ (the stop symbol), which maps instances to themselves (σ(a) = a). P is the set of all prefix codes from T* that are terminated by σ.
Members of T* (and therefore of P) uniquely define a transformation on I.
To use this entropic distance measure in an instance-based classifier for an instance a, a method is needed for selecting values of the parameter x0 for real attributes and s for symbolic attributes, together with a way of using the values returned by the distance measure to give a prediction.
For each dimension, a value must be chosen for the parameter x0 (real attributes) or s (symbolic attributes). The behavior of the distance measure as these parameters change is of interest. For any function P*, the effective number of instances can be computed using the expression [13]:
$$n_0 \le \frac{\left(\sum_b P^*(b \mid a)\right)^2}{\sum_b P^*(b \mid a)^2} \le N$$
where N is the total number of training instances and n0 is the number of training instances at the smallest distance from a (for this attribute).
The K* algorithm chooses a value for x0 (or s) by selecting a number between n0 and N and inverting the expression above. Choosing n0 gives the nearest neighbor algorithm, and choosing N gives equally weighted instances. For convenience, this number is specified with the blending parameter b, which ranges from b = 0% (for n0) to b = 100% (for N), with intermediate values interpolated linearly [13].
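The selection of x0 from the blending parameter can be sketched numerically. The following is a minimal sketch, assuming that for a real attribute P*(b|a) is proportional to exp(-|a - b| / x0), which is an assumption here and not stated in this text. The function names, search bounds, and sample values are hypothetical. It computes the effective number of instances and inverts it by bisection to find the x0 that matches a given blend percentage.

```python
from math import exp

def effective_instances(a, train, x0):
    """Effective number of instances, (sum P*)^2 / sum (P*)^2, around `a`.

    Assumes P*(b|a) is proportional to exp(-|a - b| / x0). Constant factors
    cancel in the ratio, and distances are shifted by the minimum distance
    so that very small x0 values do not underflow every weight to zero.
    """
    dists = [abs(a - b) for b in train]
    dmin = min(dists)
    weights = [exp(-(d - dmin) / x0) for d in dists]
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

def choose_x0(a, train, blend_percent, lo=1e-6, hi=1e6, steps=200):
    """Find x0 whose effective instance count is n0 + b% of (N - n0)."""
    dists = [abs(a - b) for b in train]
    dmin = min(dists)
    n0 = sum(1 for d in dists if d == dmin)  # instances at the smallest distance
    target = n0 + (blend_percent / 100.0) * (len(train) - n0)
    for _ in range(steps):
        mid = (lo * hi) ** 0.5  # bisection on a logarithmic scale
        if effective_instances(a, train, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

# Hypothetical values of one real attribute and a query instance.
train = [1.0, 2.0, 4.0, 8.0]
query = 2.5
x0 = choose_x0(query, train, blend_percent=20)
print(x0, effective_instances(query, train, x0))
```

With a blend of 0% the search collapses toward the nearest neighbor case, and with 100% it drifts toward equal weighting of all N instances, matching the two extremes described above.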

Neural network
A neural network is a parallel, distributed information processing structure composed of processing elements (which can have a local memory and carry out localized information processing operations) interconnected by unidirectional signal channels called connections. Each processing element has a single output connection that branches ("fans out") into as many collateral connections as desired, each carrying the same signal, which is the output signal of the processing element [14].

Support vector machine
The support vector machine (SVM) was introduced by Boser, Guyon, and Vapnik at COLT-92 and has since become popular. The algorithm was developed from statistical learning theory, which dates back to the 1960s, and is therefore theoretically well motivated [15].
SVM mainly deals with pattern classification, that is, the algorithm is used to classify different types of patterns. Patterns may be linear or nonlinear. Linear patterns are easily distinguishable and can be separated in a low-dimensional space, whereas nonlinear patterns are not easily distinguishable or separable and therefore need further manipulation before they can be separated [15].
The main idea of SVM is to construct an optimal hyperplane that can be used for classification in order to separate linear patterns. The optimal hyperplane is the one, selected from the set of hyperplanes that classify the patterns, that maximizes the margin, that is, the distance between the hyperplane and the nearest point of each pattern. The main goal of SVM is to maximize the margin so that the given patterns are classified correctly: the larger the margin, the more accurately the patterns are classified [15,18]. The hyperplane equation is $w^T x + b = 0$. Using a kernel function, the given patterns can be mapped into a higher-dimensional space through a mapping Φ, that is, x → Φ(x). The choice of kernel function is very important for classification with SVM; commonly used kernel functions include the linear, polynomial (Poly), radial basis function (RBF), and sigmoid kernels [14].
The Poly kernel function equation is given in [14]. The basic concept of the support vector machine is as follows. A training set of N independent and identically distributed samples $\{(x_i, y_i)\}_{i=1}^{N}$ is given, where $x_i \in R^d$ is the classification input and $y_i \in \{-1, 1\}$ is the classification output. The objective is to determine the hyperplane $w^T x + b = 0$ that separates the two classes of samples accurately. The optimal classification problem is thereby translated into a quadratic programming problem. The search for the separating hyperplane maximizes the width of the margin on both sides of it, $2/\|w\|$, which is equivalent to minimizing the norm of the weight vector. In its standard hard-margin form, it is expressed as [14]:
$$\min_{w,\,b} \; \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i\,(w^T x_i + b) \ge 1, \quad i = 1, \dots, N$$
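The role of the hyperplane, the margin 2/||w||, and the kernel functions can be illustrated with a short sketch. It is a minimal sketch under stated assumptions, not the formulation of [14]: the kernel forms below are common textbook variants, their parameter names (degree, coef0, gamma) are assumptions, and the samples and the hyperplane (w, b) are hypothetical values chosen by hand rather than obtained by solving the quadratic program.

```python
from math import exp, sqrt, tanh

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def decision(w, b, x):
    """Class predicted by the hyperplane w.x + b = 0 (+1 or -1)."""
    return 1 if dot(w, x) + b >= 0 else -1

def margin_width(w):
    """Width of the margin on both sides of the hyperplane, 2 / ||w||."""
    return 2.0 / sqrt(dot(w, w))

def satisfies_margin(w, b, samples):
    """Check the constraint y_i (w.x_i + b) >= 1 for every training sample."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in samples)

# Common kernel forms (assumed variants; parameter names are illustrative).
def linear_kernel(u, v):
    return dot(u, v)

def poly_kernel(u, v, degree=2, coef0=1.0):
    return (dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=0.5):
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def sigmoid_kernel(u, v, gamma=0.5, coef0=0.0):
    return tanh(gamma * dot(u, v) + coef0)

# Hypothetical linearly separable samples (x_i, y_i) with y_i in {-1, 1}.
samples = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]
w, b = (0.5, 0.5), -1.0  # hand-picked hyperplane for illustration

print(satisfies_margin(w, b, samples), margin_width(w))
```

Here the points (2, 2) and (0, 0) lie exactly on the margin boundaries, so they play the role of support vectors for this hand-picked hyperplane.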

Case study
The main problem in the construction industry is that it involves a number of risks, and a model should be used to analyze these risks in order to minimize them. A scientific research methodology is adopted, which includes three stages:
2. Risk response concepts and strategies in construction projects.
3. Studying cost elements, their types, the factors affecting them, and the causes of their appearance.
a. Studying the artificial intelligence techniques, the steps of their procedures, and their uses in construction projects.
b. Studying the simulation methods, the steps of their procedures, and their uses in construction projects.

Field study
The field study includes the following:

Open questionnaire
This stage consisted of many interviews with experts, including managers, university professors, and other project parties in the following ministries: the Ministry of Higher Education and Scientific Research, the Ministry of Construction and Housing, and the Ministry of Education. These interviews played a very important role in helping the researcher in the later stage. They also covered discussion of the questionnaire, which was initially prepared from the literature and previous studies, and led to modifications of the form and the addition of further questions with the experts' help, to ensure that the method and the questions presented would succeed.

Closed questionnaire
After the interviews with the experts were finished, the research problems were divided into several groups: the risks that cause cost overruns, the top risks and their impacts on the projects, the strategies used for each risk, the reasons for risk response failure, and finally the risks generated by the risk response.

Stage of system building and software design
In light of the responses received from the questionnaire, the practical study is as follows:
1. Planning of risk.
3. Analysis of risk using decision tree and K-star.
4. Risk response evaluation using neural network and support vector machine.
In this model, two types of classification were used, descriptive classification using a decision tree and predictive classification using K-star, as follows:
• Identify the dependent variable.
• Identify the independent variable.
• Implement descriptive classification using a decision tree.
• Implement predictive classification using K-star.

Identify the independent variable of descriptive classification
The decision tree application is an example of descriptive classification. This type of classification considers a number of attributes (variables) that affect the variable to be described, and it is important for identifying the variables that have an effect on the target. This research uses a decision tree to describe the qualitative analysis of project cost risks based on historical data. The data used to develop the classification model were past data from various engineering works in different ministries. The data were gathered directly from the engineering works and through direct interviews with engineers and managers.
The results were collected from two sources, the literature survey and the field investigation (interviews and questionnaire analysis), as mentioned before. Twenty-three variables were considered as the independent variables. These variables are risks, each with a probability rated too high, high, medium, low, or too low and an impact rated too high, high, medium, low, or too low, as shown in Table 1.

Dependent variables
The qualitative analysis is the dependent variable, with the values too high, high, medium, low, and too low, and each individual engineer or manager is used as the basic unit of observation. The model is therefore an attempt to build a model of the independent variables that can describe the qualitative analysis classification.

Decision tree implementation: Weka program
Waikato Environment for Knowledge Analysis (Weka) is a popular machine learning software suite written in Java and developed at the University of Waikato, New Zealand. It is free software licensed under the GNU General Public License. Weka (pronounced to rhyme with Mecca) is a workbench [16] that contains a collection of visualization tools and algorithms for data analysis and predictive modeling, together with graphical user interfaces for easy access to this functionality [15]. The Weka 3 version, whose development started in early 1997, is used in many different application areas, especially for education and research.
Weka supports several standard data mining tasks, specifically data preprocessing, regression, classification, clustering, visualization, and feature selection [17,19], as shown in Figure 2.
The Weka GUI is opened and the Explorer is selected. The file to be classified is loaded by pressing the Open file button in the Preprocess panel, which is used to choose and modify the data being acted on, as shown in Figure 3.
At this stage, the data were loaded, and full information is shown in the figure above: the relation (the name of the file), the total number of instances, the attributes and their types, the missing values, and others.
After this stage, a classifier was selected to perform the descriptive analysis as shown in Figure 4.
At the top of the Classify section there is a Classifier box. This box has a text field that gives the name of the classifier the researcher is working with. Clicking the trees entry shows the several algorithms available under this option; the researcher selected the J48 algorithm, which is an implementation of C4.5, as shown in Figure 5. In this step, the properties of the algorithm are selected. In this model, after trial and error, the confidence factor is 0.25, debug is false, minNumObj is 2, numFolds is 3, and the unpruned option is true, which gave the researcher the best results achieved, as shown in Tables 2 and 3.
According to the tree, the risks with medium impact have the following probabilities: medium, 13 risks; low, 3 risks; too low and high have no medium classification. The risks with low impact have the following probabilities: medium has 7 of class low, too low has 1, low has 36, and high does not have any. The risks with high impact have the class high, with two of them wrongly classified, and the risks with impact too low and too high have one of class too low and one of class too high, respectively, as shown in Figure 6.

K-star implementation: Weka program
As mentioned, Weka is a popular machine learning software. The K-star algorithm is used for predictive classification, to predict the qualitative analysis of the risks for the period 2014-2016 from the qualitative analysis for the previous periods from 2006 to 2014. The result is shown in Table 4.

Table 3. The classified and the actual data of the decision tree in the WEKA program.
Columns: Risks, Actual, Classified. [Table rows not recovered; the first risk listed is Price.]

Measure                           Value     Remark
Correctly classified instances    91.304%   Considered good classification accuracy
Incorrectly classified instances  8.6957%   There is no classification error
Kappa statistic                   81.85%    Considered a good value as compared to the realistic

Table 5. The correctly and incorrectly classified instances using cross-validation in K-star.

Thus the probability of each instance is calculated for each category of the qualitative analysis, and the highest probability is taken for the classification of the new instance.
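The summary measures reported for a classifier, the share of correctly classified instances and the kappa statistic, can both be derived from a confusion matrix. The sketch below shows the calculation under stated assumptions: the function name is hypothetical and the confusion matrix is invented for illustration, so its values are not the results of this study.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts instances of actual class i classified as class j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Agreement expected by chance, from the row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical 3-class confusion matrix (not data from this study).
confusion = [
    [10, 1, 0],
    [1, 8, 0],
    [0, 0, 3],
]
accuracy, kappa = accuracy_and_kappa(confusion)
print(f"correctly classified: {accuracy:.3%}, kappa: {kappa:.4f}")
```

Kappa discounts the agreement that would be expected by chance, which is why it is lower than the raw accuracy for the same matrix.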
The process of opening the program and loading the file was described earlier; the next step is choosing the properties of the algorithm, as shown in Figure 7.
The global blend was set to 20 after several rounds of trial and error, as this gave the best result. Entropic auto-blending, which means entropy-based blending, was not used in this case in order to obtain better accuracy.
The training data set has good accuracy. This high accuracy arises because the algorithm uses an entropy-based distance and uses the whole data set as training data, as shown in Figure 8.
After this step, the testing data set is uploaded to perform the predictive classification.
To compare the two techniques, the whole data set is used in cross-validation, as shown in Table 5.
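Cross-validation over the whole data set can be sketched as follows. This is a minimal sketch with hypothetical function names and toy data; it uses simple unstratified folds and a majority-class stand-in classifier, and does not reproduce the procedure of any particular tool.

```python
from collections import Counter

def k_fold_splits(n_instances, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    folds = [list(range(start, n_instances, k)) for start in range(k)]
    for held_out, test_idx in enumerate(folds):
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, test_idx

def cross_validated_accuracy(instances, labels, k, fit, predict):
    """Share of instances classified correctly when each fold is held out once."""
    correct = 0
    for train_idx, test_idx in k_fold_splits(len(instances), k):
        model = fit([instances[i] for i in train_idx], [labels[i] for i in train_idx])
        correct += sum(predict(model, instances[i]) == labels[i] for i in test_idx)
    return correct / len(instances)

# Stand-in classifier (hypothetical): always predicts the majority training class.
def fit_majority(xs, ys):
    return Counter(ys).most_common(1)[0][0]

def predict_majority(model, x):
    return model

# Hypothetical toy data: 8 instances of class "low" and 2 of class "high".
instances = list(range(10))
labels = ["low"] * 8 + ["high"] * 2
acc = cross_validated_accuracy(instances, labels, 5, fit_majority, predict_majority)
print(acc)
```

Because every instance is held out exactly once, two classifiers evaluated with the same folds are compared on the same test instances.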

KNIME implementation
Using the KNIME program, risk response failure in construction projects was analyzed using the following techniques.

Neural network
This technique was used as part of the model to describe the risk response failure.
This workflow represents the neural network model. To configure a node, right-click it and select "Configure" from the menu, as shown in Figure 9.
The maximum number of iterations was set to 100, and by trial and error the number of hidden layers was set to 1 and the number of hidden neurons to 10, as shown in Figure 10.
The results are shown in Tables 6 and 7.
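The network shape described above, one hidden layer with 10 hidden neurons, can be sketched as a forward pass. This is a minimal sketch under stated assumptions: the weights are random and untrained, the sigmoid activation, the number of input features, and the two output classes are assumptions, and the code does not reproduce the training procedure of the tool used in this study.

```python
import random
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def init_layer(n_in, n_out, rng):
    """One weight row per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def layer_forward(weights, inputs):
    """Each processing element sums its weighted inputs and applies a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
        for row in weights
    ]

def forward(network, inputs):
    """Pass the signal through every layer; each output fans out to the next layer."""
    activations = inputs
    for layer in network:
        activations = layer_forward(layer, activations)
    return activations

rng = random.Random(0)
n_features, n_hidden, n_classes = 4, 10, 2  # feature and class counts are assumed
network = [
    init_layer(n_features, n_hidden, rng),  # the single hidden layer
    init_layer(n_hidden, n_classes, rng),   # the output layer
]

outputs = forward(network, [0.2, 0.7, 0.1, 0.9])  # hypothetical input record
predicted = max(range(n_classes), key=lambda k: outputs[k])
print(outputs, predicted)
```

Training would adjust the weights over repeated passes through the data, up to the maximum number of iterations; that step is omitted here.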

Support vector machine
This technique was used as part of the model to predict risk response failure, as shown in Figures 11 and 12.

Table 6. The results of the risk response effectiveness in the 41 projects.

... risk which is the quality control on the material and expertise in execution, and that leads to an error.
3. The statistical analysis for the period 2014-2016 shows that the risks with the highest qualitative analysis are the same as those resulting from classification with the J48 algorithm, except for two risks, which are financial difficulty by the owner and changes in the purchase costs or delay in the delivery of equipment and machinery, and that leads to an error.
4. Different risks were found because of the different conditions of these periods; however, some risks, such as delay in completing the project, exceptional circumstances, and wrong estimation, existed in every period.

The decision tree is a successful qualitative technique in risk analysis
The statistical analysis for the period 2014-2016 showed that the risks with the highest qualitative analysis are the same as those resulting from classification with the K-star algorithm, except for one risk, which is financial difficulty by the owner.
Second: Risk response identification
• In the period 2006-2007, the method used for risk response selection was historical information from similar previous projects, with a mean of 4.03, which means this method was often used for selection.
• In the period 2008-2013, the method used for risk response selection was historical information from similar previous projects, with a mean of 3.73, which means this method was often used for selection.
• In the period 2014-2016, the method used for risk response selection was historical information from similar previous projects, with a mean of 3.80, which means this method was often used for selection.
• The existing methods and tools for selecting a risk response are based on historical information from similar previous projects, making them easily affected by anxiety and uncertainty.
• Three techniques were used to identify risk response failure:
a. The decision tree shows high accuracy, because it is considered the best algorithm for predicting a nominal class.
b. The neural network shows lower accuracy, as the algorithm by nature tends more toward numerical classes.
c. The support vector machine shows good results, close to those of the decision tree.
• The most important reasons that risk response failed in the period 2006-2007 were the difficulty of implementing a risk response plan correctly because of internal factors (terrorism and sabotage), multiple decision sources for selecting a response strategy, inadequate strategy with high risk, and the inability to introduce sophisticated management methods to respond to risks.