## 1. Introduction

Economic pressure and the need to become more competitive drive organizations to invest in efficient methodologies that yield solutions with clear advantages in a very demanding market. In this scenario, statistical approaches emerge as valuable tools for the chemical process industry. Indeed, the chemical industry uses a wide set of statistical methodologies, ranging from descriptive approaches to complex optimization topics such as Design of Experiments (DoE), always targeting safer, more repeatable and more profitable solutions.

Chemical processes often have a complex, nonlinear and multivariate nature in which several factors significantly influence the final outputs. Traditional one-by-one optimization tests factors one at a time instead of varying all of them simultaneously. This approach has several drawbacks: it requires an excessive number of experiments, it can miss the optimal set of factor levels, and it neglects the interactions between factors [1]. These interactions can play a key role in system performance. Furthermore, the procedure is very time-consuming. These considerations favor the DoE approach over fundamental or mechanistic models [1, 2, 3]. Besides these clear advantages, implementing DoE is an easy way to reduce the sources of variability in a process and is the first step toward an optimized solution [3, 4].
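The difference between one-factor-at-a-time testing and a factorial design can be illustrated with a minimal sketch. The `response` function below is a made-up process with a strong interaction between two factors, and all coefficients and names are assumptions chosen for illustration, not data from any of the cited works.

```python
from itertools import product


def full_factorial(n_factors):
    """Return all 2**n_factors runs in coded units (-1 = low, +1 = high)."""
    return [list(run) for run in product((-1, 1), repeat=n_factors)]


def response(a, b):
    """Hypothetical process with a strong A*B interaction (illustrative only)."""
    return 50 + 2 * a + 3 * b + 6 * a * b


def effect(design, y, columns):
    """Effect estimate: mean(y where contrast = +1) - mean(y where contrast = -1).

    The contrast is the elementwise product of the listed factor columns,
    so [0] gives the main effect of A and [0, 1] gives the A*B interaction.
    """
    total = 0.0
    for run, yi in zip(design, y):
        sign = 1
        for c in columns:
            sign *= run[c]
        total += sign * yi
    return total / (len(design) / 2)


# Factorial approach: four runs, every combination of levels.
design = full_factorial(2)
y = [response(a, b) for a, b in design]
main_a = effect(design, y, [0])
main_b = effect(design, y, [1])
inter_ab = effect(design, y, [0, 1])
factorial_best = max(design, key=lambda run: response(*run))

# One-factor-at-a-time: start at the baseline, then change A alone, then B alone.
ofat_runs = [(-1, -1), (1, -1), (-1, 1)]
ofat_best = max(ofat_runs, key=lambda run: response(*run))

print("main effects:", main_a, main_b, "interaction:", inter_ab)
print("best OFAT run:", ofat_best, "->", response(*ofat_best))
print("best factorial run:", factorial_best, "->", response(*factorial_best))
```

With these made-up coefficients, the one-factor-at-a-time runs suggest staying at the baseline (response 51). The factorial design finds the setting with both factors high (response 61) and shows that the interaction effect (12) is larger than either main effect (4 and 6).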

From a practical standpoint, with DoE implementation, users can find the best solution for any measurable process within corresponding constraints. To do so, the following elements are required:

An objective function to maximize or minimize a response.

A predictive model able to describe the main trends of the system.

Variables that can be adjusted to satisfy the process constraints.
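The three elements above can be sketched together in a few lines. The fitted model `predicted_yield`, the constraint in `feasible` and the factor names are all hypothetical, and a plain grid search stands in for whatever optimizer a DoE package would actually use.

```python
def predicted_yield(temp, time):
    """Predictive model: a hypothetical fitted quadratic in coded units."""
    return (80 + 4 * temp + 3 * time
            - 3 * temp ** 2 - 2 * time ** 2
            + 1.5 * temp * time)


def feasible(temp, time):
    """Process constraints (assumed): factor ranges plus one linear limit."""
    in_range = -1 <= temp <= 1 and -1 <= time <= 1
    # Small tolerance so that grid points on the boundary are not lost
    # to floating-point rounding.
    return in_range and temp + time <= 1.0 + 1e-9


# Adjustable variables: a coarse grid over the coded factor space.
grid = [i / 20 for i in range(-20, 21)]
candidates = [(t, s) for t in grid for s in grid if feasible(t, s)]

# Objective function: maximize the predicted response over feasible settings.
best = max(candidates, key=lambda point: predicted_yield(*point))

print("best feasible setting:", best)
print("predicted yield:", round(predicted_yield(*best), 3))
```

A grid search is used here only because it is easy to read. For more factors or tighter tolerances, a dedicated constrained optimizer would be the more usual choice.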

One of the first steps in generating a good predictive statistical model based on DoE is to decide how many of the factors that really affect the process are worth including. Previous data are of major relevance for selecting a small set of factors. When users select too many factors, many of them are often unnecessary and the resulting problem becomes needlessly complex to solve.

After the factors and their ranges are selected, the experimental runs should be performed in random order. At this stage, empirical models are generated and their adequacy should be evaluated by different statistical procedures such as R² measures, analysis of variance (ANOVA) or diagnosis of residual abnormalities [5]. Next, response surface methodology (RSM) should be used to find the optimal operating conditions for the different system responses. RSM generates polynomial functions from which the minimum, the maximum or a desired value within a range can be determined for each response of interest. Optimization can be carried out for a single response or, by taking advantage of the desirability concept, for multiple responses with different restrictions [6].
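The desirability concept mentioned above can be sketched as follows. This uses a common formulation (one-sided desirability functions of the Derringer-Suich type, combined by a geometric mean), which is an assumption and may differ in detail from the treatment in [6]. The response values and limits are invented for illustration.

```python
import math


def d_maximize(y, low, target, weight=1.0):
    """Desirability for a larger-is-better response.

    0 at or below `low`, 1 at or above `target`, power ramp in between.
    """
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight


def d_minimize(y, target, high, weight=1.0):
    """Desirability for a smaller-is-better response.

    1 at or below `target`, 0 at or above `high`, power ramp in between.
    """
    if y <= target:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - target)) ** weight


def overall_desirability(values):
    """Geometric mean of individual desirabilities (0 if any one is 0)."""
    if any(d == 0.0 for d in values):
        return 0.0
    return math.exp(sum(math.log(d) for d in values) / len(values))


# Hypothetical predictions at one candidate operating point.
d_yield = d_maximize(85.0, low=70.0, target=90.0)
d_impurity = d_minimize(2.0, target=1.0, high=5.0)
overall = overall_desirability([d_yield, d_impurity])

print("individual:", d_yield, d_impurity, "overall:", round(overall, 4))
```

Because the geometric mean is zero whenever any individual desirability is zero, a setting that fails one response completely cannot be rescued by good performance on the others.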

Also, the generated model can be used for robust design purposes [3]. In fact, computer-aided optimization might place the process on a sharp peak of the response, in which case the system will not be robust to variation transmitted from the input variables. Advanced statistical methods (propagation of error is a valuable option among others) can be used to find the flat regions of response surfaces. These regions are desirable because they are not much affected by variations in factor settings. Improvements are still possible by narrowing tolerance intervals. To accomplish this goal, several engineering decisions can be taken: (a) accept the response variation as reasonable for this kind of process; (b) change the process design specifications; (c) improve the measurement system; (d) improve the process control; or even (e) reject the system as unable to achieve the required targets.
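Propagation of error can be sketched with the usual first-order approximation, in which the transmitted variance is the sum of squared slopes times the input variances. The model, the input standard deviations and the function names below are assumptions for illustration, and the residual variance of the model itself is left out for simplicity.

```python
import math


def model(x1, x2):
    """Hypothetical fitted response surface in coded units."""
    return 70 + 4 * x1 + 6 * x2 - 4 * x1 * x2


def transmitted_sd(f, point, input_sds, h=1e-5):
    """First-order propagation of error.

    sd_y ~ sqrt(sum((df/dx_i * sd_i)**2)), with slopes from central differences.
    """
    variance = 0.0
    for i, sd in enumerate(input_sds):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        slope = (f(*up) - f(*down)) / (2 * h)
        variance += (slope * sd) ** 2
    return math.sqrt(variance)


# Assumed input variation: x1 is hard to control, x2 is well controlled.
input_sds = (0.2, 0.02)

# Scan a grid and pick the setting with the least transmitted variation.
grid = [i / 10 for i in range(-10, 11)]
points = [(a, b) for a in grid for b in grid]
robust = min(points, key=lambda p: transmitted_sd(model, p, input_sds))

for p in [(-1.0, 1.0), (1.0, 1.0)]:
    print(p, "response:", model(*p),
          "transmitted sd:", round(transmitted_sd(model, p, input_sds), 4))
print("flattest setting on the grid:", robust)
```

In this made-up example, the settings (-1, 1) and (1, 1) both predict a response of 76, but the transmitted standard deviation is about 0.2 at the first and about 0.04 at the second. The predicted response alone therefore says nothing about robustness.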

A cost-effective solution can be found by narrowing the standard deviations of the input factors through improvements in the measurement system or the process control [3, 6]. This is a major step toward making a chemical process more repeatable and predictable, leading to significant cost savings.

The previous paragraphs show how statistical models can be used to obtain optimized and robust solutions to typical problems in chemical processes. Additional challenges can also be addressed and overcome with statistical approaches. When the number of significant factors affecting the process is too high, an overwhelming number of runs should be avoided. In such cases, a highly fractionated design (minimal resolution III or IV designs) can be adopted [4]. To break the aliases, the minimal resolution designs should be combined with a complete fold-over methodology. This means adding a second block of runs with the signs reversed on all factors, which breaks the aliases between main effects and two-factor interactions. Another common problem in the chemical industry is designing a set of experiments in which both operating and mixture parameters occur. An illustrative example is a baking experiment in which, besides one observable process variable, six mixture components are part of the input factors [7]. Typical designs do not work at this level. A special type of mathematical formulation is required, and the adequate solution relies on applying a crossed mixture-process design. With this approach, it is possible to combine quantitative parameters with mixture component restrictions.
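The complete fold-over described earlier in this paragraph can be illustrated with the smallest possible case: a three-factor half fraction with generator C = AB (resolution III), chosen here only because it is easy to inspect by hand. The helper names are illustrative and do not come from any DoE library.

```python
from itertools import product


def half_fraction():
    """2^(3-1) resolution III design in coded units, generator C = A*B."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]


def fold_over(design):
    """Second block of runs with the signs reversed on all factors."""
    return [tuple(-x for x in run) for run in design]


def contrast(design, factors):
    """Elementwise product of the chosen factor columns, one value per run."""
    column = []
    for run in design:
        value = 1
        for f in factors:
            value *= run[f]
        column.append(value)
    return column


def overlap(u, v):
    """Normalized dot product: +1 or -1 means fully aliased, 0 means orthogonal."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


A, B, C = 0, 1, 2
base = half_fraction()
combined = base + fold_over(base)

alias_before = overlap(contrast(base, [A]), contrast(base, [B, C]))
alias_after = overlap(contrast(combined, [A]), contrast(combined, [B, C]))

print("A vs BC, original block:", alias_before)
print("A vs BC, with fold-over:", alias_after)
```

In the original four runs, the column for A is identical to the column for the BC interaction, so the two effects cannot be separated. After the fold-over block is added, the two columns are orthogonal. In this small case the combined eight runs happen to form the full 2³ factorial.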

Sometimes, standard RSM designs are not the best option, or are not even suitable, for solving chemical problems. Indeed, when multifactor linear constraints or categorical factors are involved, optimal designs are the correct option [6]. This is also the best solution when cubic empirical models are needed to fit the experimental data well.

Many other challenging problems can be tackled with advanced DoE strategies such as nested and split designs, experiments with random factors or even evolutionary operation methods (needed for the continuous improvement of a full-scale process) [2].

Parameter designs (two-array) made popular by Taguchi are another suitable option for finding optimal operating conditions for quality improvement in different chemical processes [8]. Although Taguchi's approach became extremely popular as an effective tool for quality improvement in the 1980s, a large controversy arose because there were significant issues with the advocated experimental strategy and data analysis procedures [9]. Additionally, it was concluded that fractional designs deliver considerably more information and are therefore much more efficient than the two-array parameter designs developed by Taguchi. Nevertheless, Taguchi methods are still valid and of utmost importance at the industrial level.

Giving graduate students, teachers, researchers and other professionals the necessary knowledge of the statistical tools available to them, with emphasis on DoE approaches, is perhaps the easiest way to bring this methodology into the mainstream. With this knowledge, they can deepen their fundamental theoretical understanding of the topic and optimize chemical processes with more efficient approaches. A more efficient process performs better and is more cost-effective, which increases the interest in commercializing it. Therefore, the aim of this book is to serve as a starting point for new researchers (and experienced ones) who want to perform statistical analysis of chemical processes, with emphasis on DoE.