
Chemistry » "Statistical Approaches With Emphasis on Design of Experiments Applied to Chemical Processes", book edited by Valter Silva, ISBN 978-953-51-3878-5, Print ISBN 978-953-51-3877-8, Published: March 7, 2018 under CC BY 3.0 license. © The Author(s).

Chapter 2

Design of Experiments Applied to Industrial Process

By Neelesh Kumar Sahu and Atul Andhare
DOI: 10.5772/intechopen.73558



Response optimization and exploration are challenging tasks for the experimenter. The cause-and-effect relationships between input variables and responses can be found by conducting experiments in a proper sequence. Generally, the relationship between a response of interest y and predictor variables x1, x2, x3, …, xk is established after careful design of the experimentation. For example, y might be biodiesel production from crude ‘Mahua’ oil, and x1, x2 and x3 might be the reaction temperature, reaction time and catalyst feed rate of the process. In the present book chapter, design of experiments is discussed based on predictor variables, with the aim of building a relationship between the response and the variables. Subsequently, a case study is discussed that demonstrates the use of design of experiments, based on response surface methodology, for predicting surface roughness in the machining of titanium alloys.

Keywords: design of experiments, response surface methodology, optimization, ANOVA

1. Introduction

Researchers find unknown solutions by conducting experiments in which two or more input factors are varied [1]. Typical answers obtained from experiments are:

  • The effect of the input variables on the solutions or responses

  • Which combination of input variables will give the best solution?

  • What ranges of the variables are suitable for experiments?

  • Under what conditions should we operate our plant?

Experiments help us make direct comparisons among treatments of interest. Design of experiments minimizes bias in these comparisons, which helps in reducing error [2]. One of the advantages of design of experiments is that we control the experiments, which allows us to make decisions about the influence of the input variables on the response. Explicitly, one can draw conclusions about causation.

An experiment consists of treatments, experimental units, responses and a method to assign treatments to units. Mosteller and Tukey [3] describe three concepts for developing the relationship between variables and responses, namely consistency, responsiveness and mechanism. A properly designed experiment should avoid systematic error, should be precise, should allow estimation of errors and should have broad validity.

Some important terms and concepts used in design of experiments are listed below.

1.1. Treatment

Treatments are the different procedures being compared. Amounts of fertilizer in agronomy, different long-distance rate structures in marketing or different temperatures in a reactor vessel in chemical engineering are examples of treatments.

1.2. Experimental units

These are the units to which treatments are applied. Graphs are plotted to see the variation of the response across these units.

1.3. Responses

These are the outputs we measure during experiments. Responses characterize the mechanism of the process under study. Examples of responses are the fatty acid ethyl ester or nitrogen content in biodiesel production, the combustion performance of a biodiesel, the biomass of corn plants, the profit from production, or the yield and quality of product per ton of raw material.

1.4. Randomization

Randomization is the assignment of treatments to units by a recognized, well-defined probabilistic mechanism.

1.5. Experimental error

Experimental error is the variation present in all experimentally measured responses. Runs at different settings of the variables give different results for the responses. Moreover, conducting experiments at the same settings over and over again gives different results in different trials. It should be noted that experimental error within an acceptable range does not indicate that the experiments were conducted wrongly.

1.6. Measurement units

These are the units on which the responses are measured, for example combustion pressure at different percentage blends of biodiesel. They may differ from the experimental units. For example, fertilizer is applied to a plot of land containing corn plants, some of which are harvested and measured: the plot is the experimental unit and the plants are the measurement units. Similarly, ingots of steel are given different heat treatments, and each ingot is punched in four locations to measure its hardness: the ingots are the experimental units and the locations on each ingot are the measurement units.

2. Design of experiments

An experiment can be defined as a test or series of runs in which purposeful changes are made to the input variables of a system or process so that changes in the output response variable may be observed and the reasons for the same may be identified [4, 5, 6]. Some process variables x1, x2, … xp are controllable, whereas other variables z1, z2, … zq may be uncontrollable. An experiment serves the following purposes:

  1. Determine which variables x1, x2, … xp are most influential on response y.

  2. Determine where to set the influential x’s so that y is always near to the desired nominal value.

  3. Determine where to set the influential x’s so that variability in y is minimized.

  4. Determine where to set the influential x’s so that effects of uncontrolled variables are minimized.

Design of Experiments refers to the process of planning, designing and analyzing the experiment so that valid and objective conclusions can be drawn effectively and efficiently [7]. In order to draw statistically sound conclusions from the experiment, it is necessary to integrate simple and powerful statistical methods into the experimental design methodology [8]. The success of any industrially designed experiment depends on sound planning, appropriate choice of design and statistical analysis of data and teamwork skills.

2.1. Approaches for experimentation

The approach to planning and conducting the experiment is called the strategy of experimentation [9]. The best guess approach is the most common and uses guesswork to arbitrarily select a combination of input factors for testing. However, this is unscientific and one cannot confirm whether a better response obtained is indeed the best solution.

Another approach is ‘one factor at a time’ (OFAT), in which one factor is varied sequentially through different levels while all other factors are kept constant. The levels may be quantitative (such as temperature or voltage) or qualitative (such as the presence of coolant). The main effect of a factor is the change in response produced by a change in the level of that factor. However, the OFAT approach can reveal only one causal effect at a time, and often the causal effects of multiple factors are not additive, meaning there is interaction between them. An interaction is the failure of one factor to produce the same effect on the response at different levels of another factor. The OFAT approach cannot capture interaction effects because all other factors are kept constant while one factor is varied.

The scientific approach therefore is to vary several factors together at a time so that both main effects as well as interaction effects of factors on the response variable may be identified and studied. This is called factorial experimental design and this is the only way to discover interactions between variables. In factorial experiments, factors contain discrete values (levels), and the number of factor levels influences design of experimental runs. When all possible combinations of the levels of the factors are investigated, then it is called a full factorial experiment. In contrast, a fractional factorial experiment is a variation of the full factorial design in which only a subset of the runs is used.
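A full factorial run list is simply the Cartesian product of the factor levels. The following minimal sketch enumerates such a design; the factor names and level values are illustrative, not taken from the chapter:

```python
# Enumerate all runs of a full factorial experiment: every combination
# of factor levels. Factor names and levels here are illustrative.
from itertools import product

factors = {
    "temperature": [150, 175, 200],   # three quantitative levels
    "time": [30, 60],                 # two quantitative levels
    "catalyst": ["A", "B"],           # two qualitative levels
}

runs = list(product(*factors.values()))
print(len(runs))  # 3 * 2 * 2 = 12 runs
for run in runs[:3]:
    print(dict(zip(factors.keys(), run)))
```

A fractional factorial design would keep only a carefully chosen subset of these 12 runs.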

Various other kinds of experimental designs are in place such as Plackett-Burman design, Taguchi method, response surface methodology, mixed response design and Latin hypercube design [10]. Each of these designs uses different techniques to generate experimental runs. Of these, response surface methodology is of particular interest as it takes three levels of factors to generate an experimental design sequence and uses a quadratic polynomial model for conducting analysis.

The three principles of experimental design such as randomization, replication and blocking are used in industrial experiments in order to improve the efficiency of experimentation. Randomization is the random ordering of experiments to ensure all levels of a factor have equal chance of being affected by noise factors (unwanted sources of variability) such as temperature or power fluctuation. Replication is the process of repeating all or a part of experiment runs in a random sequence to allow more precise estimation of experimental error as well as main and interaction effects. Blocking is the process of arranging similar experimental runs into blocks (or groups) to distribute the effect of change in blocking factors such as batch, machine, time of day, etc. across the experiments and avoid confounding (confusion whether the output change is due to change in block or change in factor level).

For statistical analysis under design of experiments (DOE), the factor level numbers are considered instead of the actual value of the factor at that level. In other words, the factors are represented by coded variables instead of natural or uncoded variables. In the case of categorical variables, the levels are represented by natural numbers as 1, 2, …, l. Quantitative variables can also be expressed in this manner in many experimental design methods.

Let xi and wi be the coded and uncoded values respectively for level i of a control variable having li levels. Then wlow and whigh refer to the uncoded values of the factor at the lowermost and uppermost levels respectively. For categorical variables, xi and wi are expressed as Eqs. (1) and (2):

xi = i, i = 1, 2, …, li (1)

wi = value of the factor at level i (2)

In the case of response surface methodology, the number of levels for all quantitative variables is odd, and the middle level is given the value 0. Thus the remaining levels get equally distributed on both sides of the middle level, for example, −2, −1, 0, +1, +2. Then, xi and wi are expressed as Eqs. (3) and (4):

xi = [wi − (whigh + wlow)/2] / [(whigh − wlow)/(li − 1)] (3)

wi = (whigh + wlow)/2 + xi (whigh − wlow)/(li − 1) (4)
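The centering-and-scaling rule for coding a quantitative variable can be sketched as follows; the temperature range used here is illustrative, and the functions assume the common two-level form with the half-range as the scale:

```python
# Convert between natural (uncoded) and coded values for a quantitative
# factor, assuming the usual centering-and-scaling coding rule.
def to_coded(w, w_low, w_high):
    center = (w_high + w_low) / 2
    half_range = (w_high - w_low) / 2
    return (w - center) / half_range

def to_natural(x, w_low, w_high):
    center = (w_high + w_low) / 2
    half_range = (w_high - w_low) / 2
    return center + x * half_range

# Reaction temperature varied between 150 and 200 degC (illustrative):
print(to_coded(175, 150, 200))   # center maps to 0.0
print(to_coded(200, 150, 200))   # high level maps to 1.0
print(to_natural(-1, 150, 200))  # coded -1 maps back to 150.0
```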
3. Response surface methodology

Response surface methodology or RSM is a collection of mathematical and statistical techniques used for the modeling and analysis of problems in which a response of interest is influenced by several variables and the objective is to optimize the response. The method was introduced by G. E. P. Box and K. B. Wilson in 1951. It uses a sequence of designed experiments to obtain an optimal response and uses a second-degree polynomial model to achieve this.

Let a process contain n input variables x1, x2, …, xn. Then the response y is given by Eq. (5):

y = f(x1, x2, …, xn) + ε (5)

where ε is the error or noise observed in the response. If the expected response is denoted by E(y) = f(x1, x2, …, xn) = η, then the response surface is represented by Eq. (6):

η = f(x1, x2, …, xn) (6)
The response can be represented graphically, either in three-dimensional space or as contour plots that help visualize the shape of the response surface. Contours are curves of constant response drawn in the xi, xj plane keeping all other variables fixed. Each contour corresponds to a particular height of the response surface. RSM also explores the relationships between the response variable and the several input variables. If the response is modeled by a linear function of the independent variables, then the approximating function is the first-order model shown by Eq. (7):

y = β0 + β1x1 + β2x2 + … + βnxn + ε (7)
If there is curvature in the system, then a polynomial of higher degree must be used. Most industrial problems can be modeled with sufficient accuracy by using a second-degree polynomial, which yields the second-order model shown by Eq. (8):

y = β0 + Σ βi xi + Σ βii xi² + ΣΣ βij xi xj + ε (8)

where the single sums run over i = 1, …, n and the double sum runs over the pairs i < j.
The method of least squares chooses the β’s in Eq. (8) so that the sum of the squares of the errors ε is minimized. The least squares function is shown by Eq. (9):

L = Σ εi² = Σ (yi − ŷi)², i = 1, 2, …, m (9)

where yi is the observed response at the i-th run and ŷi is the value predicted by Eq. (8). By substituting εi from Eq. (8) into Eq. (9) and differentiating L with respect to each coefficient β, the regression coefficients can be obtained.
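The least squares fit of a second-order model can be sketched numerically. The data below are synthetic (generated for illustration only), and the fit is done by building the design matrix of Eq. (8) and solving it by ordinary least squares:

```python
import numpy as np

# Fit a second-order (quadratic) response surface in two coded variables
# by ordinary least squares, minimizing the sum of squared errors.
# The data are synthetic and for illustration only.
rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 20)
x2 = rng.uniform(-1, 1, 20)
y = 5 + 2 * x1 - 3 * x2 + 1.5 * x1 * x2 + 0.5 * x1**2 \
    + rng.normal(0, 0.1, 20)

# Design matrix: intercept, linear, interaction and pure quadratic terms.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 2))  # approximately [5, 2, -3, 1.5, 0.5, 0]
```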

3.1. Response surface designs

Response surface designs are those experimental designs which are used for fitting response surfaces and generally contain three factor levels [11]. Two types of response surface designs are used, namely the central composite design and the Box-Behnken design.

3.1.1. Central composite design

This consists of a factorial design (the corners of a cube), center and axial (or star) points that allow for estimation of second-order effects [12]. The addition of axial points practically increases the number of levels to five, as shown in Figure 1. This may create problems if the axial points cannot be run due to technical or safety reasons. For a design having k factors, the distance of the axial points from the design center is α = 2^(k/4).


Figure 1.

Central composite design for three factors.

A central composite design containing axial points at the calculated value of α is called a circumscribed central composite design. If it is not possible to use this value of α, a provision exists whereby α can be taken equal to 1 in order to obtain what is called a face-centered central composite design.
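The coded points of a circumscribed central composite design can be generated directly from the description above; this sketch builds the 2^k corners, the 2k axial points at distance α = 2^(k/4), and one center point:

```python
from itertools import product

# Coded points of a circumscribed central composite design for k factors:
# 2^k factorial corners, 2k axial points at distance alpha = 2**(k/4),
# and a single center point.
def ccd_points(k):
    alpha = 2 ** (k / 4)
    corners = list(product([-1.0, 1.0], repeat=k))
    axial = []
    for i in range(k):
        for a in (-alpha, alpha):
            pt = [0.0] * k
            pt[i] = a
            axial.append(tuple(pt))
    return corners + axial + [tuple([0.0] * k)]

pts = ccd_points(3)
print(len(pts))      # 8 corners + 6 axial + 1 center = 15
print(2 ** (3 / 4))  # alpha for k = 3, approximately 1.682
```

In practice, several replicate center points are usually added; with six center points a three-factor design grows to the 20 runs used in the case study below.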

3.1.2. Box-Behnken design

This design overcomes some shortcomings of the central composite design by avoiding the axial points and corner points of the design space (that is, by bypassing extreme factor combinations) and by taking only three factor levels, as shown in Figure 2. The design ensures that all factors are never set to their high levels simultaneously and thus keeps the design points within safe operating limits.


Figure 2.

Box-Behnken design for three factors.

Also, this design is rotatable or nearly rotatable, meaning that it provides the desirable property of (approximately) constant prediction variance at all points that are equidistant from the design center. Compared to the central composite design, it requires fewer experimental runs for the same number of factors. Hence, Box-Behnken designs have several advantages over central composite designs.
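The Box-Behnken construction described above (edge midpoints of the cube plus a center point, no corners and no axial points) can be sketched as:

```python
from itertools import combinations, product

# Coded points of a Box-Behnken design: for every pair of factors, take
# all (+/-1, +/-1) combinations with the remaining factors held at 0,
# then add a center point. No corner or axial points are generated.
def box_behnken_points(k):
    pts = []
    for i, j in combinations(range(k), 2):
        for a, b in product([-1.0, 1.0], repeat=2):
            pt = [0.0] * k
            pt[i], pt[j] = a, b
            pts.append(tuple(pt))
    pts.append(tuple([0.0] * k))
    return pts

pts = box_behnken_points(3)
print(len(pts))  # 12 edge points + 1 center = 13
```

Because every point has at least one factor at its middle level, no run ever sets all factors to their extremes; as with the central composite design, replicate center points are usually added in practice.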

3.2. Analysis of variance (ANOVA)

The analysis of variance (ANOVA), established by Ronald Fisher in 1918, is a statistical tool used to analyze variation among and between groups. ANOVA is used to identify the significant and insignificant parameters of the predicted model. The procedure involves checking the variability contributed by each variable to the response individually [13]. It is based on two hypotheses, namely H0 (all the regression coefficients are zero) and H1 (at least one of the regression coefficients is non-zero). If H0 is false, then one or more of the variables contribute significantly to the developed model for the response [14]. In this test procedure, the sums of squares of the regression and of the errors are calculated. To test the hypothesis, the F value is calculated as the ratio of the mean square (regression) to the mean square (error). Larger values of F suggest that the model is significant. Alternatively, the p value expresses the significance of the predicted model in statistical terms: if the p value is less than 0.05 the model terms are significant, and a p value greater than 0.05 indicates that the model terms are not significant.

Similarly, the value of R² (coefficient of determination) is calculated as the ratio of the sum of squares of the regression to the total sum of squares. The R² value indicates how satisfactorily the model represents the process and how well the experimental values correlate with the values provided by the model equation. For a good fit, R² should be at least 0.80. However, a large value of R² does not necessarily imply that the regression model is a good one: adding a variable to the model will always increase R², regardless of whether the additional variable is statistically significant or not. Thus it is possible for models with large values of R² to yield poor predictions of new observations or estimates of the mean response. Therefore it is often beneficial to calculate the adjusted coefficient of determination, R²adj = 1 − [SSerror/(n − p)] / [SStotal/(n − 1)], where n is the number of runs and p the number of model terms. When R² and R²adj differ appreciably, there is a good chance that non-significant terms have been included in the model.
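The quantities above (F value, p value, R² and R²adj) follow directly from the sums of squares. A minimal sketch with illustrative numbers (not the case-study data):

```python
from scipy import stats

# Compute F, p, R^2 and adjusted R^2 from sums of squares.
# The numbers below are illustrative only.
ss_regression, df_regression = 0.30, 9    # model with 9 terms
ss_error, df_error = 0.012, 10            # residual
ss_total = ss_regression + ss_error
n = df_regression + df_error + 1          # number of runs

f_value = (ss_regression / df_regression) / (ss_error / df_error)
p_value = stats.f.sf(f_value, df_regression, df_error)  # upper tail
r2 = ss_regression / ss_total
r2_adj = 1 - (ss_error / df_error) / (ss_total / (n - 1))

print(round(f_value, 2), round(r2, 3), round(r2_adj, 3))
```

Note that r2_adj is always at most r2, and the gap widens as weak terms are added to the model.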

3.3. Backward elimination approach for developed model evaluation

After developing a model, its adequacy is checked by the F test and the p value [15]. For a model term to be significant, it should have a high F value and a low p value. Insignificant model terms do not affect the response and can therefore be removed from the model. To remove insignificant terms while the reduced model still explains the response, the backward elimination method (also known as stepwise deletion) is used. In this method, a t test or F test for the significance of each design variable is performed, beginning with the full model. Insignificant variables with the highest p value (e.g. p > 0.05) are removed from the full model. The stepwise elimination procedure is as follows:

Step 1:

Initially the full model can be written as shown in Eq. (10):

y = β0 + β1x1 + β2x2 + … + βn−1xn−1 + ε (10)

Then the following n − 1 tests are carried out for the null hypotheses H0j: βj = 0. The lowest partial F-test value Fl corresponding to H0j: βj = 0, or the corresponding t-test value tl, is compared with the preselected significance values F0 and t0. One of two possible steps (step 2a or step 2b) is then taken.

Step 2a:

A variable xl is eliminated if it satisfies Fl < F0 or tl < t0. The modified model can then be written as Eq. (11):

y = β0 + β1x1 + … + βl−1xl−1 + βl+1xl+1 + … + βn−1xn−1 + ε (11)
Step 2b:

If Fl > F0 or tl > t0, the original model is the model we should choose.

The procedure stops automatically when no variable in the current model can be removed and none of the eliminated candidates can be brought back into it. The current model is then the selected model.
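The elimination loop above can be sketched as follows. This is a simplified illustration using a p-value criterion on t tests; the data and term names are synthetic, and the inert variable x2 is the candidate for removal:

```python
import numpy as np
from scipy import stats

# Backward elimination sketch: starting from the full model, repeatedly
# drop the term with the largest p value above 0.05 and refit.
def fit_p_values(X, y):
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    mse = resid @ resid / (n - p)
    cov = mse * np.linalg.inv(X.T @ X)          # coefficient covariance
    t = beta / np.sqrt(np.diag(cov))
    return beta, 2 * stats.t.sf(np.abs(t), n - p)  # two-sided p values

rng = np.random.default_rng(1)
x1, x2 = rng.uniform(-1, 1, (2, 30))
y = 4 + 3 * x1 + rng.normal(0, 0.2, 30)         # x2 carries no signal

terms = ["const", "x1", "x2"]
cols = [np.ones(30), x1, x2]
while True:
    X = np.column_stack(cols)
    beta, p = fit_p_values(X, y)
    worst = int(np.argmax(p))
    if p[worst] <= 0.05 or terms[worst] == "const":
        break                                    # nothing left to drop
    terms.pop(worst)
    cols.pop(worst)

print(terms)  # x2, which carries no signal, is typically eliminated
```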

In the present chapter, the responses measured after machining are analyzed using response surface methodology with the cutting parameters as input variables. Initially, RSM models are developed for each response. The significance of each variable is confirmed through ANOVA, and insignificant terms are then removed using the backward elimination approach. The analysis of the machining responses is discussed in the following sections.

4. Case study for using design of experiments in machining operation

Surface roughness is the most widely used indicator to quantify the surface integrity of a machined part [16, 17]. It directly reflects the quality of the surface finish and has been used by many researchers. Surface roughness is influenced by several factors such as cutting speed, feed, depth of cut, tool geometry, tool wear, etc. [17, 18, 19, 20]. Therefore, in the present work, surface roughness is taken as the response.

In the present case study, design of experiments with a central composite design was performed based on response surface methodology. This design is constructed from factorial points (the corners of a cube), center points and axial (or star) points that allow for estimation of second-order effects [21]. The addition of axial points practically increases the number of levels to five. This may create problems if the axial points cannot be run due to technical or safety reasons. For a design having k factors, the distance of the axial points from the design center is α = 2^(k/4), as shown in Figure 3. If it is not possible to use this value of α, a provision exists whereby α can be taken equal to 1 in order to obtain what is called a face-centered central composite design. In the present case study, based on the input factors and their levels shown in Table 1, 20 sets of experiments were performed for each of the turning and milling operations. The design of experiments was performed using MINITAB 17 statistical software. For the present work, based on the number of input factors k, the value of α was taken as 1.682. The coded and natural levels of the independent variables for the design of experiments are presented in Table 1. Five levels of the cutting parameters were calculated in the central composite design using Eq. (12) shown above. After defining the levels of the cutting parameters, the sequence of experiments was generated in MINITAB 17 using the central composite design for the turning and milling operations. Table 2 shows the 20 sets of experiments in terms of the coded values of the cutting parameters, sequenced according to run order. The number of experiments was generated based on the number of input factors and their levels.


Figure 3.

Design of experiment using central composite design.

Level | Lowest | Low | Center | High | Highest
Coded value (x) | −1.682 | −1 | 0 | 1 | 1.682
Cutting speed Vc (m/min), turning | 69.9 | 90.4 | 120 | 150 | 171.4
Feed rate f (mm/min), turning | 55.6 | 72 | 96 | 120.6 | 136.6
Depth of cut ap (mm), milling | 1.83 | 2.0 | 2.25 | 2.5 | 2.67

Table 1.

Level of cutting parameters used for central composite design.
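The five natural levels follow from the center value, the step between adjacent levels and α = 1.682. A minimal sketch, using the cutting-speed center and step implied by Table 1 (the published lowest and highest values differ slightly, presumably due to rounding when the machine settings were fixed):

```python
# Five natural levels of a rotatable three-factor central composite
# design (alpha = 1.682), given a center value and the step between
# the low and center levels.
def five_levels(center, step, alpha=1.682):
    return [round(center + x * step, 1) for x in (-alpha, -1, 0, 1, alpha)]

print(five_levels(center=120, step=30))  # cutting speed Vc (m/min)
```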

Std order | Run order | Pt type | Blocks | Cutting speed | Feed rate | Depth of cut

Table 2.

Sequence of experiments obtained using MINITAB.

where x is the coded value of the level of an individual cutting parameter, Vc is the cutting speed in m/min, f is the feed rate in mm/rev and ap is the depth of cut in mm.

In the present case study, minimization of surface roughness is carried out for the turning and milling operations. Surface roughness was measured after each machining operation. In order to compensate for measurement error, surface roughness was measured at three locations on the machined surface and the average value was taken. Table 3 shows the list of experiments and the corresponding surface roughness for the turning operation.

Run type | Cutting speed Vc (m/min) | Feed rate f (mm/min) | Depth of cut ap (mm) | Surface roughness Ra (μm)

Table 3.

Surface roughness measurement after turning operation.

Second-order models are developed for surface roughness in turning using RSM. After developing the models, ANOVA is performed to identify the significant and insignificant terms in the models, as shown in Table 4. Insignificant terms are identified and eliminated using the backward elimination procedure. In Table 4, a variable for which the value of p is less than 0.05 has a significant effect on the response.

Source | Sum of squares | DF | Mean square | F value | p value (Prob > F)
Vc × f | 4.572e-5 | 1 | 4.572e-5 | 0.039 | 0.8474
Vc × ap | 1.591e-4 | 1 | 1.591e-4 | 0.14 | 0.7203
ap × f | 2.841e-5 | 1 | 2.841e-5 | 0.024 | 0.8794
Lack of fit | 9.587e-3 | 5 | 1.917e-3 | 4.47 | 0.629
Pure error | 2.143e-3 | 5 | 4.286e-4 | |
Cor. total | 0.3219 | | | |

Table 4.

ANOVA analysis for surface roughness as response and cutting parameters as variables in turning operation.

The ANOVA results shown in Table 4 demonstrate that the model is highly significant and the lack of fit is non-significant. The model showed a coefficient of determination (R²) of 93.13% for turning, which means that more than 90% of the variation in the data is explained by the model. Furthermore, the significance of each coefficient in the full model was examined through its F-value and p-value: larger values of F and smaller values of p (p < 0.1) indicate that the corresponding variable is highly significant. The results given in Table 4 suggest that the effects of f² (square of feed rate), ap² (square of depth of cut), Vc × f (cutting speed × feed rate), Vc × ap (cutting speed × depth of cut) and f × ap (feed rate × depth of cut) are non-significant and can therefore be removed from the full model to further improve it, as shown in Eq. (13).


4.1. Validation of developed model for surface roughness in turning operation

In order to verify the adequacy of the developed model, five validation experiments were performed, as depicted in Table 5. The conditions were ones that had not been used previously but lie within the range of the levels defined earlier. The values predicted by the equation developed for surface roughness and the actual experimental values were compared, and the percentage errors were calculated; all these values are presented in Table 5. The percentage error between the actual and predicted values ranges from −3.18 to 13.69%, which is acceptable. The residual from the least squares fit is defined by ei = yi − yi* for i = 1, 2, …, 20, where yi is the observed response (surface roughness) and yi* is the predicted response. A check of the normality assumption may be made by constructing a normal probability plot of the residuals: if the residuals plot approximately along a straight line, the normality assumption is satisfied. Figure 4 presents the normal probability plot of the residuals, which reveals no apparent problem with normality.
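The percentage-error comparison can be reproduced from the predicted and measured values in Table 5. The exact error convention is not stated in the chapter; the sketch below assumes (measured − predicted)/predicted, which matches the tabulated values to within rounding:

```python
# Percentage error between measured and predicted surface roughness for
# the five validation runs (values from Table 5). The error convention
# (measured - predicted) / predicted is an assumption that reproduces
# the published figures to within rounding.
predicted = [0.797, 0.681, 0.634, 0.565, 0.464]
actual = [0.772, 0.748, 0.721, 0.6325, 0.501]

errors = [round(100 * (a - p) / p, 2) for p, a in zip(predicted, actual)]
print(errors)
print(max(abs(e) for e in errors))  # worst-case error across the runs
```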

Parameters | Exp. 1 | Exp. 2 | Exp. 3 | Exp. 4 | Exp. 5
Cutting speed (m/min) | 72.5 | 95.0 | 110.0 | 130.0 | 160.0
Feed rate (mm/min) | 60 | 80 | 100 | 125 | 140
Depth of cut (mm) | | | | |
Predicted Ra (μm) | 0.797 | 0.681 | 0.634 | 0.565 | 0.464
Actual Ra (μm) | 0.772 | 0.748 | 0.721 | 0.6325 | 0.501
% Error | −3.18 | 9.96 | 13.69 | 11.95 | 8

Table 5.

Confirmation experiments for validating surface roughness model for turning operation.


Figure 4.

Normal probability plot of residual for surface roughness in turning operation.

From the confirmation experiments and the normal probability plot of the residuals, it is observed that the developed model can predict the surface roughness in the turning operation. Figure 5 shows the response surface plots, which give a graphical display of these relationships. Typically, the variance of the prediction is also of interest, because it is a direct measure of the likely error associated with the point estimate produced by the model.


Figure 5.

Response surface plots.

From the response surface plots it is also observed that the interaction of cutting speed and feed rate strongly affects the surface roughness, whereas the interactions of feed rate with depth of cut and of cutting speed with depth of cut have a negligible effect on surface roughness [22, 23].

5. Summary

From the above study it can be concluded that, with a proper design of experiments, the experimenter can predict the response even when the underlying mechanism of the process is not fully understood. Proper fitting of the response to the experimental data can be achieved through design of experiments, regression modeling, statistical analysis and optimization. The following conclusions can be drawn from the case study:

  • Design of experiments is a very structured methodology for planning and designing a sequence of experiments.

  • Analysis of variance (ANOVA) was used to identify significant input variables for particular response.

  • A prediction model can be developed for a response with a coefficient of determination of more than 90%, which confirms that the model properly explains the experimental data.

  • The developed predictive model can help industries in achieving appropriate output for improving productivity.


1 - Anderson-Cook CM, Goldfarb H, Borror CM, Montgomery DC, Canter KG, Twist JA. Mixture and mixture process variables experiments for pharmaceutical applications. Pharmaceutical Statistics. 2004;3:247-260
2 - Chung PJ, Goldfarb HB, Montgomery DC. Optimal designs for mixture-process experiments with control and noise variables. Journal of Quality Technology. 2007;39:179-190
3 - Mosteller F, Tukey JW. Data Analysis and Regression: A Second Course in Statistics. Addison-Wesley Series in Behavioural Science: Quantitative Methods. Boston, MA, USA: Addison-Wesley Publishing Company; 1977
4 - Myers RH, Montgomery DC, Anderson-Cook CM. Response Surface Methodology: Process and Product Optimization Using Designed Experiments. New Jersey, USA: John Wiley & Sons; 2016
5 - Choudhary AK, Chelladurai H, Kannan C. Optimization of combustion performance of bioethanol (water hyacinth) diesel blends on diesel engine using response surface methodology. Arabian Journal for Science and Engineering. 2015;40(12):3675-3695. DOI: 10.1007/s13369-015-1810-y
6 - El-Tayeb NSM, Yap TC, Venkatesh VC, Brevern PV. Modeling of cryogenic frictional behaviour of titanium alloys using Response Surface Methodology approach. Materials & Design. 2009;30(10):4023-4034. DOI: 10.1016/j.matdes.2009.05.020
7 - Condra LW. Reliability Improvement with Design of Experiments. New York: Marcel Dekker; 1993
8 - Cornell JA. Experiments with Mixtures, Designs, Models, and the Analysis of Mixture Data. 3rd ed. New York: John Wiley & Sons; 2002
9 - Allen TT, Yu L, Schmitz J. An experimental design criterion for minimizing meta-model prediction errors applied to a die casting process design. Applied Statistics. 2003;52:103-117
10 - Montgomery DC. Design and Analysis of Experiments. New Jersey, USA: John Wiley & Sons; 2017
11 - Drain D, Carlyle WM, Montgomery DC, Borror CM, Anderson-Cook CM. A genetic algorithm hybrid for constructing optimal response surface designs. Quality and Reliability Engineering International. 2004;20:637-650
12 - Draper NR. Center points in second-order response surface designs. Technometrics. 1982;24:127-133
13 - Andrews DF. A robust method for multiple linear regression. Technometrics. 1974;16:523-531
14 - Bartlett MS, Kendall DG. The statistical analysis of variance heterogeneity and the logarithmic transformation. Journal of the Royal Statistical Society, Series B. 1946;8:128-150
15 - Cornell JA. Fitting models to data from mixture experiments containing other factors. Journal of Quality Technology. 1995;27(1):13-33
16 - Ulutan D, Ozel T. Machining induced surface integrity in titanium and nickel alloys: A review. International Journal of Machine Tools and Manufacture. 2011;51(3):250-280
17 - Che-Haron CH, Jawaid A. The effect of machining on surface integrity of titanium alloy Ti–6% Al–4% V. Journal of Materials Processing Technology. 2005;166(2):188-192. DOI: 10.1016/j.jmatprotec.2004.08.012
18 - Razfar MR, Asadnia M, Haghshenas M, Farahnakian M. Optimum surface roughness prediction in face milling X20Cr13 using particle swarm optimization algorithm. Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture. 2010;224:1645-1653
19 - Sharif S, Mohruni AS, Noordin MY, Venkatesh VC. Optimization of surface roughness prediction model in end milling Titanium Alloy (Ti-6Al4V). In: Proceeding of ICOMAST2006, International Conference on Manufacturing Science and Technology, 2006. Melaka, Malaysia: Faculty of Engineering and Technology, Multimedia University; 2006. pp. 55-58
20 - Mukherjee I, Ray PK. A review of optimization techniques in metal cutting processes. Computers & Industrial Engineering. 2006;50(1–2):15-34. DOI: 10.1016/j.cie.2005.10.001
21 - Box GEP. The effect of errors in the factor levels and experimental design. Technometrics. 1963;6:247-262
22 - Sahu NK, Andhare AB. Optimization of surface roughness in turning of Ti-6Al-4V Using Response Surface Methodology and TLBO. In: ASME 2015 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. New York USA: American Society of Mechanical Engineering; 2015
23 - Sahu NK, Andhare AB. Modelling and multiobjective optimization for productivity improvement in high speed milling of Ti–6Al–4V using RSM and GA. Journal of the Brazilian Society of Mechanical Sciences and Engineering. 2017;39(12):5069-5085