Application of Artificial Neural Networks to Chemical and Process Engineering

The accelerated use of Artificial Neural Networks (ANNs) in Chemical and Process Engineering has drawn the attention of scientific and industrial communities, mainly due to the Big Data boom related to the analysis and interpretation of the large data volumes required by Industry 4.0. ANNs are well-known nonlinear regression algorithms in the Machine Learning field for classification and prediction, and are inspired by the behavior of the human brain, which learns tasks from experience through interconnected neurons. This empirical method can often replace traditional complex phenomenological models based on nonlinear conservation equations, leading to a smaller computational effort, a particularly attractive feature for its use in process optimization and control. This chapter therefore presents several ANN modeling applications in different Chemical and Process Engineering areas, such as thermodynamics, kinetics and catalysis, process analysis and optimization, and process safety and control, among others. This review shows the increasing use of ANNs in the area, helping to understand and explore aspects of process data for future research.


Introduction
Many chemical and process engineers are excited about the applications of Artificial Intelligence (AI) to their fields of expertise. AI can be defined as the ability of digital computers to perform tasks at which, at the moment, people are better [1]. In this context, Machine Learning (ML) is seen as one of the most relevant subareas, providing computers with the ability to learn without being explicitly programmed. It is essentially a form of applied statistics used to estimate complex functions, with less emphasis on obtaining confidence intervals around them [2].
This current excitement was also stimulated by the Big Data boom related to the analysis and interpretation of large data volumes (of the order of several terabytes), which are generated at high rates and present various formats (numbers, text, multimedia, among others). Industry 4.0 requires this piece of knowledge from chemical and process engineers since process plants have large volumes of stored

Most common activation functions used in chemical and process engineering applications
A neural network contains hyperparameters to be tuned prior to training in order to achieve the best configuration. Among them, the following can be mentioned: (i) number of hidden neurons, (ii) activation function, (iii) optimizer, and (iv) regularization, together with their dependent settings (learning rate, optimizer-specific parameters, dropout rate, etc.).
In particular, activation functions determine the output of the model, its accuracy, and the computational efficiency of training; therefore, they are an essential part of the structure of neural networks. The Sigmoid function, Hyperbolic Tangent (TanH), and ReLU (Rectified Linear Unit) are the most common in Chemical Engineering; however, recent studies have improved on these classical activation functions by defining new ones, such as Leaky ReLU, Swish, and H-Swish [11].
In the sigmoid activation function, the output values are bounded between 0 and 1, normalizing each neuron output. However, it suffers from the vanishing gradient problem, and its outputs are not zero-centered. To make the modeling easier, the TanH was proposed. Its outputs are bounded between -1 and 1 and are zero-centered, which makes it easier to handle inputs that contain strongly negative, neutral, and strongly positive values.
In order to circumvent the computational expense, the ReLU was proposed. It is a computationally efficient piecewise-linear (and therefore nonlinear) activation function that outputs the input directly if it is positive and outputs zero otherwise. A further development is the Leaky ReLU, which gives the function a small nonzero slope to the left of x = 0. This avoids the dying ReLU problem, in which some neurons output zero for all inputs and remain inactive.
Therefore, the correct definition of the activation function is a fundamental part of the hyperparameter tuning to guarantee the best configuration of a neural network. In the course of the chapter, we will always mention which activation function each work used in the summary tables.
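As a brief illustration of the activation functions discussed in this section, the following Python sketch implements the sigmoid, TanH, ReLU, and Leaky ReLU functions using only the standard library. The function names and the Leaky ReLU slope of 0.01 are illustrative assumptions, not values taken from the cited works.

```python
import math


def sigmoid(x):
    """Sigmoid: output bounded between 0 and 1 (not zero-centered)."""
    # Two algebraically equivalent branches avoid overflow in math.exp.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_gradient(x):
    """Derivative of the sigmoid; it approaches zero for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh(x):
    """Hyperbolic tangent: output bounded between -1 and 1, zero-centered."""
    return math.tanh(x)


def relu(x):
    """ReLU: returns the input if it is positive, otherwise zero."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: small nonzero slope for x <= 0 (slope value is an assumption)."""
    return x if x > 0 else slope * x


if __name__ == "__main__":
    for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(x, sigmoid(x), sigmoid_gradient(x), tanh(x), relu(x), leaky_relu(x))
```

The sigmoid derivative never exceeds 0.25 and shrinks rapidly away from zero, which is one way to see the vanishing gradient problem, whereas ReLU keeps a unit slope for every positive input.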

Applications to chemical and process engineering
In recent decades, a large number of studies have used ANNs in chemical engineering, covering molecular property prediction [12], fault diagnosis [13], predictive control [14], and optimization [15,16]. First-principles knowledge should be integrated with the neural network in order to retain a more physical understanding of the system [14]. In the following subsections, we present the principal papers of each area, with tables summarizing the characteristics of the ANNs used.

Thermodynamics and transport phenomena
Several data-driven models have been employed to predict phase equilibrium and transport phenomena coefficients for various chemical systems [17]. Indeed, these fields already have some empiricism in their standard mathematical formulations. For example, flash algorithms have some empiricism when using binary interaction parameters in subjective mixing rules [18], and the majority of transport phenomena coefficients are estimated from empirical correlations, some of which are questionable [19]. Therefore, ANNs offer a direct way to find functional relationships between the model variables without first determining these constants [20].
Moreover, ANNs offer a potentially faster alternative to the property prediction calculations used in process simulations, whose computational cost limits process control applications that must run in real time. To this end, Poort et al. [21] studied the replacement of conventional Equations of State (EoS) for property and phase stability calculations on a binary methanol-water mixture. They trained ANNs with data generated through the Thermodynamics for Engineering Applications (TEA) to represent four kinds of flash algorithms, leading to a speed-up of 15 times for the prediction of properties and 35 times for the classification of phases.
ANNs have also been used to predict whether a particular mixture forms an azeotrope, which is essential information for designing and controlling a separation process. Alves et al. [22] successfully developed an ANN classification model to determine whether binary mixtures can exhibit azeotropy based solely on the properties of the pure components as input variables. This shows the power of ANNs for this type of thermodynamic evaluation, since the model does not take into account the non-ideality of the mixture. ANNs are also widely employed to predict thermophysical properties of ionic liquids, such as density and viscosity [23]. These values come primarily from laboratory experiments, since ionic liquids do not have a universal description of their phase behavior. For example, using the group contribution concept and the operating temperature, Valderrama et al. [24] successfully developed a three-layer feedforward ANN (FF-ANN) to estimate the density of ionic liquids.
ANNs have also been employed in statistical thermodynamics techniques, which compute physicochemical properties from molecular simulations. One of these methods, the High-Throughput Force Field Simulation (HT-FFS), can generate large volumes of data. ANNs can be trained with these data, thus building a gray-box model that improves the property predictions with a lower computational effort [25]. They have also been used in Density Functional Theory (DFT) calculations to replace some physical functionals with data-driven ones, finding the energy levels for electronic structures of different compounds with a balance between computational cost and accuracy [26].
Regarding their application to Transport Phenomena, it is well known that ANNs, as universal approximators for nonlinear functions [27], can be used for estimating convective heat- and mass-transfer coefficients [17], mainly in situations in which no mathematical correlation can fit them, as is the case for bubble columns. To this end, Verma and Srivastava [19] successfully built an ANN model from literature data with eight inputs related to the system configuration of a bubble column (gas velocity, Prandtl number, number of holes, hole diameter, column diameter, surface tension, gas holdup, and bed height) and one output (heat transfer coefficient). Table 1 displays a summary of the current applications of neural networks to thermodynamics and transport phenomena discussed above. In the table, we specify the field, case study, class of neural network, activation function, topology, and software used in each work.

Kinetics and catalysis
Neural networks have been successfully applied to catalysis to determine the relationship between the catalyst structure and its activity [8]. As heterogeneous catalysis has developed increasingly efficient experimentation techniques, the amount of new data has increased exponentially [28], coming from synthesis as well as from characterization and catalytic tests [29]. Thus, there is a need for more adequate tools to manage these large amounts of experimental data, to understand and model them, and to find ways to optimize the catalytic performance [30].
Two types of ANN applications have been described so far in the framework of combinatorial catalysis: (i) ANN catalyst compositional models, correlating composition and synthesis variables with catalytic performance, and (ii) ANN kinetic models, correlating reaction conditions with catalytic performance [31]. Examples include the design of propylene ammoxidation catalysts [32], the design of catalysts for the oxidative coupling of methane [33], and the analysis and prediction of results for the decomposition of NO over zeolites [34], among other studies. ANNs have also been combined with genetic algorithms for designing propane ammoxidation catalysts [35]. Another work reported the viability of ANNs in the analysis and prediction of catalytic results within a collection of catalysts produced by combinatorial techniques [36]. Recently, an ANN was applied to estimate the rate of the methanol dehydration reaction in dimethyl ether synthesis [37]. The results showed that an ANN is a powerful tool for evaluating the reaction rate instead of using sophisticated kinetic model equations.
The number of publications in this catalysis field has shown an upward trend, especially in the last decade, with the high demand for practical applications of Big Data concepts. The group of Turkish researchers led by Günay and Yildirim has stood out in the field, using not only ANNs for extracting knowledge from catalytic data, but also decision tree algorithms to determine the heuristic conditions and rules that lead to high catalyst performance. For example, in a work on carbon monoxide oxidation over Cu-based catalysts, they used 1337 data points from 20 studies to evaluate catalyst performance using ANNs [38].
In the field of heterogeneous catalysis, ANNs can be used to select better candidate catalysts (cheaper, less toxic, and composed of non-precious metals) for a given reaction, thus reducing the massive number of high-throughput experiments typically needed in combinatorial catalysis [39]. In this direction, Cavalcanti et al. [40] used a three-layer feedforward neural network to predict the ideal composition of the catalyst in the water-gas-shift reaction and to discover useful trends through sensitivity analysis. The ANN had several input variables, while the only output variable considered was the conversion of CO. The model for the reaction was successfully developed, exhibiting the power of ANNs for predicting better catalysts and operating conditions for the process.
Recently, Cavalcanti et al. [8] showed that ANNs are able to identify the variables that most influence the conversion of CO in the water-gas-shift reaction, namely temperature and surface area. The results can be used to conduct subsequent research in this area in an optimized manner, contributing to the well-managed use of environmental resources by selecting efficient catalysts for producing hydrogen, a clean energy source.
In the same topic, Garona et al. [41] presented an empirical model for the Fischer-Tropsch Synthesis (FTS) reaction using ANNs. A database of FTS to light olefins was assembled from the literature, and feedforward neural networks were used to build more complete models, which helped to predict optimal catalyst composition and operating conditions.
ANNs were also used to model the sintering of a catalyst in a dry reformer [42]. In particular, the effects of temperature, pressure, and catalyst diameter on methane and CO₂ conversions, H₂/CO ratio, and molar percentage of solid carbon deposited on the catalyst (responsible for deactivation) were studied. The ANN design was automated using a Genetic Algorithm (GA) search over the set of possible network topologies. The inclusion of the effective number of parameters in the GA objective function led to networks that performed well on the test data points.
Another application is the determination of acidity in zeolites with data from FTIR spectroscopy [43]. FF-ANNs were used for multivariate analysis based on the characteristic absorbance of 11 zeolite samples after metal substitution (Zn, Cu, Ga, and Ag) in the ~3612 cm⁻¹ region. The developed regression method gave acid-site results in agreement with those from other conventional and expensive methodologies.
Thus, in order to formulate a new kind of catalyst, it is essential to understand the past of catalysis [44]. By using ANNs, it is possible to convert historical data from past publications into valuable information, leading to a great acceleration in the development of new catalysts with better performance for a given process [8]. Table 2 presents a summary of the current applications of neural networks to catalytic processes.

Process analysis and optimization
The applications of neural networks to process analysis are increasing. Assidjo et al. [45] modeled the drying process in coconut production using a neural network. The goal was to predict the moisture of dried grated coconut, whose dynamics are not well known. The authors used a fully connected feedforward neural network with a 9-4-1 architecture, selected based on the minimum error in the test set. The results indicate that the proposed neural network, constructed using industrial plant data, can be used as a prediction method.
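The topology notation and the selection criterion mentioned above can be illustrated with a short Python sketch. It counts the trainable parameters of a fully connected topology such as 9-4-1 and picks the candidate with the lowest test-set error. The candidate topologies other than 9-4-1, all error values, and the assumption of one bias per neuron are hypothetical placeholders, not results from Assidjo et al. [45].

```python
def count_parameters(topology):
    """Count weights and biases of a fully connected network, e.g. "9-4-1"."""
    layers = [int(n) for n in topology.split("-")]
    # Each layer contributes n_in * n_out weights plus n_out biases (assumed).
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))


def select_topology(test_errors):
    """Return the topology with the minimum test-set error."""
    return min(test_errors, key=test_errors.get)


# Hypothetical test-set errors for three candidate topologies (made-up values).
candidate_errors = {"9-2-1": 0.031, "9-4-1": 0.018, "9-8-1": 0.024}

best = select_topology(candidate_errors)
print(best, count_parameters(best))
```

Under these assumptions, a 9-4-1 network has 9 × 4 + 4 + 4 × 1 + 1 = 45 trainable parameters, which gives a sense of how small such a model is compared with the amount of plant data typically available.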
Fernandes and Lona [46] applied neural networks to the field of polymerization. The authors also highlighted some topologies, the number of data points needed, and the concept of stacked neural networks that can enhance the prediction of the final model.
Alves and Nascimento [47] used industrial plant data for constructing neural networks to detect gross errors; the case study was an isoprene unit facility.
Alves and Nascimento [4] studied the production of high-purity isoprene from a C₅ cut arising from a pyrolysis gasoline unit. The first-principles models were replaced by neural networks in the final grid search of the optimal parameters for the process. A set of 10 neural networks was defined to represent the whole flowsheet, whereby the number of hidden layers was defined by the minimum error in the test set. Lastly, the framework successfully optimized the chemical plant under study using neural networks with industrial data.
Khezri et al. [15] proposed a hybrid model for optimizing a large-scale gas-to-liquids (GTL) process. The dataset was constructed using a simulation model of the GTL process. Different topologies were compared to select the most promising one; one and two hidden layers with different numbers of neurons were tested. The optimal configuration had two hidden layers with 7 and 15 hidden neurons, respectively. The ANN was built using the tail gas unpurged ratio, the recycled tail gas to FT ratio, the H₂O/C ratio in the syngas section, and the CO₂ removal percentage as input features; the output was the wax production rate. The ANN model was then used for optimization purposes.
DOI: http://dx.doi.org/10.5772/intechopen.96641

Wang et al. [16] proposed a framework for predicting the operating trend of an industrial process. The framework contains three major steps: (i) multivariate correlation analysis, to deal with the correlation between the historical industrial data, (ii) clustering, due to nonlinear dense data and unclear operating trend types and (iii) a convolutional neural network (CNN), formed by five parts (input layer, convolutional layer, ReLU layer, pooling layer, and fully connected layer).
The authors pointed out the importance of the convolutional networks to extract important features from the dataset. Moreover, the advantage of such a framework was compared with traditional convolutional neural networks and recurrent neural networks (RNNs) for a methanol production process.
Cai et al. [48] analyzed an industrial process using data-driven models. The case study was the treatment of industrial reverse osmosis concentrate (ROC) with the fluidized bed reactor Fenton (FBR-Fenton) process. Prior to modeling, a statistical analysis was carried out to determine the most relevant input features (Fe²⁺ dosage, H₂O₂ dosage, pH, and HRT). Two approaches were studied: ANN and linear regression. The former gave more accurate predictions; it consisted of one input layer (4 neurons), 4 hidden layers (10 neurons each), and one output layer (2 neurons), using ReLU as the activation function because it is computationally light and a reasonable general-purpose choice for most scenarios [11].
The crystallization process and the quality of its products were studied by Lin et al. [49]. The authors used a Raman spectrum as input to a two-layer backpropagation neural network with four hidden neurons to predict the solution concentration and slurry density simultaneously. They also compared the output prediction of the neural network with the predictions of other algorithms (characteristic peaks regression, principal component regression, and partial least-squares regression), and the results indicated the superior predictive performance of the neural network, owing to its inherently nonlinear nature.
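To make the reported topology concrete, the sketch below builds a fully connected network with 4 inputs, four hidden layers of 10 neurons, and 2 outputs, and runs one forward pass with ReLU in the hidden layers. The random weights, the linear output layer, and the input values are assumptions for illustration only; this is not the trained model of Cai et al. [48].

```python
import random


def relu(x):
    """ReLU: returns the input if it is positive, otherwise zero."""
    return max(0.0, x)


def build_network(layer_sizes, seed=0):
    """Create random weights and zero biases for each pair of adjacent layers."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        network.append((weights, biases))
    return network


def forward(network, inputs):
    """Propagate inputs through the network; ReLU on hidden layers only."""
    activations = list(inputs)
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        # Assumption: the output layer is linear, as is common for regression.
        activations = z if index == last else [relu(v) for v in z]
    return activations


layer_sizes = [4, 10, 10, 10, 10, 2]
network = build_network(layer_sizes)

# Hypothetical scaled inputs: Fe2+ dosage, H2O2 dosage, pH, HRT.
outputs = forward(network, [0.2, 0.5, 0.3, 0.8])
print(len(network), len(outputs))
```

The network holds five weight matrices (one per pair of adjacent layers) and returns two values, matching the two-neuron output layer described in the text.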
Chemical process synthesis is a complex task, which comprises process modeling and design and poses combinatorial challenges. There are two major approaches: the traditional sequential form and the optimization-based synthesis using superstructure models. In the former, the problem is solved sequentially by decomposition, following a hierarchy of elements that can be depicted by an Onion Diagram (reactor, separation, heat recovery, and utility) [50].
The latter integrates all decisions in a single step, i.e., it determines the optimal structure and operating conditions simultaneously. This approach therefore accounts for all possible complex interactions between the engineering choices, including the equipment (potentially selected in the optimized flowsheet), the interconnections, and the operating conditions, formulated as an optimization problem [51-53].
A variety of methodologies have been proposed to represent a general process superstructure [54-56]. However, due to the inherent complexity of the superstructure (Figure 1), the resulting large-scale non-convex Mixed-Integer Nonlinear Programs (MINLPs) require effective solution approaches. The use of simplified models or surrogates at the unit operation level is advantageous because such unit models are present in any process simulator. Additionally, surrogates can be used to represent an entire subsystem consisting of a definite number of units. Artificial Neural Networks (ANNs) may be used to generate the surrogate models, due to their fitting characteristics [57].
To overcome the difficulty of solving a superstructure, Henao and Maravelias [58] proposed a framework to replace complex unit models (based on first principles) with surrogate models developed using artificial neural networks. The authors proposed simpler surrogate models for pumps, compressors, and flash vessels, and used two case studies (an absorption-based CO₂ capture system and a maleic anhydride process superstructure) to validate the proposed framework. The results indicate the possibility of using neural networks embedded in a rigorous optimization procedure.
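The idea of replacing a rigorous unit model with an ANN surrogate inside an optimization can be sketched in a few lines of Python. The example samples a stand-in "rigorous" function, fits a one-hidden-layer TanH network by full-batch gradient descent, and then searches the cheap surrogate on a grid. The function, network size, learning rate, and grid are hypothetical choices; the framework of Henao and Maravelias [58] addresses superstructure optimization, which this simplified one-variable grid search does not reproduce.

```python
import math
import random


def rigorous_model(x):
    """Stand-in for an expensive first-principles model (hypothetical)."""
    return (x - 0.3) ** 2 + 0.1 * math.sin(5.0 * x)


def init_network(n_hidden, rng):
    """One hidden layer with random initial weights and a scalar output."""
    return {
        "w1": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "b1": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "w2": [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def hidden_layer(net, x):
    return [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]


def predict(net, x):
    h = hidden_layer(net, x)
    return sum(w * a for w, a in zip(net["w2"], h)) + net["b2"]


def mse(net, data):
    return sum((predict(net, x) - y) ** 2 for x, y in data) / len(data)


def train(net, data, lr=0.02, epochs=2000):
    """Full-batch gradient descent on the mean squared error."""
    n, m = len(data), len(net["w1"])
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * m, [0.0] * m, [0.0] * m, 0.0
        for x, y in data:
            h = hidden_layer(net, x)
            pred = sum(w * a for w, a in zip(net["w2"], h)) + net["b2"]
            err = 2.0 * (pred - y) / n  # derivative of the MSE w.r.t. pred
            g_b2 += err
            for j in range(m):
                g_w2[j] += err * h[j]
                back = err * net["w2"][j] * (1.0 - h[j] ** 2)  # through tanh
                g_w1[j] += back * x
                g_b1[j] += back
        for j in range(m):
            net["w1"][j] -= lr * g_w1[j]
            net["b1"][j] -= lr * g_b1[j]
            net["w2"][j] -= lr * g_w2[j]
        net["b2"] -= lr * g_b2
    return net


rng = random.Random(0)
samples = [(i / 20.0, rigorous_model(i / 20.0)) for i in range(21)]
surrogate = init_network(5, rng)
initial_error = mse(surrogate, samples)
train(surrogate, samples)
final_error = mse(surrogate, samples)

# Cheap grid search on the surrogate instead of the rigorous model.
grid = [i / 200.0 for i in range(201)]
x_best = min(grid, key=lambda x: predict(surrogate, x))
print(initial_error, final_error, x_best)
```

In practice, the accuracy of the surrogate would be checked on data not used for training before its optimum is trusted, and the candidate optimum would be re-evaluated with the rigorous model.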
Savage et al. [59] proposed a hybrid machine learning-based framework to optimize a chemical process (the CryoMan Cascade cycle system was used as a case study). The authors compared different surrogate modeling algorithms (ANN and Kriging Partial Least Squares); the results indicated a reduction in the time needed for the optimization when compared with the rigorous model. Moreover, judging by the final accuracy, they found that a single large ANN was unable to capture the high nonlinearity of the process under study. Therefore, the authors broke the surrogate model into a series of parallel sub-models, which increased the final accuracy.
According to Klemes et al. [60], despite the substantial level of maturity of the process modeling, the nature of connections of the problem still allows improvements. Nascimento et al. [61] also presented alternatives for the optimization of industrial facilities using neural networks and compared them with industrial data. Table 3 presents a summary of the current applications of neural networks to process analysis and optimization.

Process safety and control
One of the most common applications of ANNs in the area of process safety and control is fault detection and diagnosis. These systems are built to identify normal process behavior and to recognize atypical variations in the chemical plant that can lead to an accident [64]. Generally, deep neural networks (ANNs that contain several hidden layers) are used to extract spatial and temporal aspects of the data for this purpose [65]. Their inputs are the measurements from the process sensors, and their outputs are the types of faults (e.g., tube plugging, valve blockage, catalyst deactivation, among others) [66]. However, determining the various hyperparameters of deep neural networks demands a considerable amount of time, which is not suitable for fast online process applications. Based on this, Peng et al. [67] applied a method to reduce the training time of these complex network architectures: the Broad Learning System (BLS). It uses an incremental learning procedure and enlarges the network in width, making a quick training stage possible. They successfully employed this strategy for fault detection in a batch fermentation process, utilizing the Affinity Propagation (AP) algorithm in a Long Short-Term Memory (LSTM) deep neural network to cluster distinct stage data.
Another use is in developing models to control process quality through variables that do not have online sensors. On the one hand, variables such as pressure, temperature, and mass flow rate can be easily measured by manometers, thermocouples, and mass flow controllers, respectively. On the other hand, the online measurement of a variable such as pH in the process is a challenge, since no large-scale equipment exists for this and it depends on offline laboratory analysis. Therefore, ANNs can be used to develop so-called soft sensors that predict quality parameters from a large volume of industrial data, improving the process control quality [68].
Finally, ANNs are also used to replace complex phenomenological models in Model Predictive Control (MPC) architectures and Real-Time Optimization (RTO) strategies [69]. MPC is related to dynamic processes and RTO to steady-state operations [69]. Both applications depend on the accuracy of the model and on the speed at which the model equations can be solved to drive the controlled variable to the desired set-point. Since ANNs have a lower computational cost than first-principles models, they are a suitable alternative to make these control strategies feasible and efficient.
A successful application of this kind of substitution is reported in [70], in which an ANN replaces a very detailed computational fluid dynamics (CFD) model that represents the synthesis of phthalic anhydride in a fixed-bed catalytic reactor within an MPC structure. Moreover, a hybrid model approach (first principles combined with an ANN) was employed in an MPC by Zhang et al. [69] to drive a reaction process in a continuous stirred tank reactor (CSTR) to optimal operating conditions. They represented the reaction rates by neural networks instead of using the nonlinear Arrhenius law to describe the reaction phenomenon; this well-known equation was instead used to generate the dataset for training the network under numerous variations in temperature and reactant concentrations. The MPC acted to stabilize the chemical process, driving it to the lowest total cost conditions. Wu et al. [14] proposed a hybrid machine-learning model that incorporates first principles into a recurrent neural network. The authors studied two models, a partially connected RNN model and a weight-constrained RNN model, and applied them to a chemical process containing two well-mixed, non-isothermal continuous stirred tank reactors in series. The two proposed models outperformed a Lyapunov-based model predictive controller in terms of prediction accuracy, smoother state trajectories, and economic advantages.
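The data-generation step described for the hybrid model can be sketched as follows. The code evaluates an Arrhenius-type rate expression over a grid of temperatures and concentrations to produce input-output pairs on which a neural network could later be trained. First-order kinetics, the pre-exponential factor, the activation energy, and the grid ranges are all hypothetical assumptions, not values from Zhang et al. [69].

```python
import math

R = 8.314  # universal gas constant, J/(mol K)

# Hypothetical kinetic parameters, chosen only for illustration.
K0 = 1.0e6   # pre-exponential factor, 1/s
E_A = 5.0e4  # activation energy, J/mol


def arrhenius_rate(temperature, concentration):
    """First-order rate r = k0 * exp(-Ea / (R T)) * C (assumed rate law)."""
    return K0 * math.exp(-E_A / (R * temperature)) * concentration


def generate_dataset(temperatures, concentrations):
    """Return a list of ((T, C), rate) pairs covering the full grid."""
    return [((t, c), arrhenius_rate(t, c))
            for t in temperatures
            for c in concentrations]


temperatures = [300.0 + 10.0 * i for i in range(11)]  # 300 K to 400 K
concentrations = [0.5 * j for j in range(1, 9)]       # 0.5 to 4.0 mol/L
dataset = generate_dataset(temperatures, concentrations)
print(len(dataset))
```

Each entry pairs a (temperature, concentration) input with the corresponding rate. In a hybrid model of this kind, a network fitted to such pairs would stand in for the rate term, while the mass and energy balances of the reactor remain first-principles equations.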
It is worth mentioning that ANNs are also being used to build detectors to prevent cyber-attacks against process plants [71]. Nowadays, with highly automated systems controlling chemical plants in real time, cybersecurity breaches can occur, which may cause accidents and economic losses. With this in mind, Chen et al. [71] developed a feedback-MPC control architecture with an ANN detector that can identify the probabilities of cyber-attacks on networked sensors. Therefore, the applicability of ANNs in these safety and control strategies is very significant for the integrability of industrial plants. Table 4 shows a summary of the current applications of neural networks to the area of process safety and control.

Future works
Today, ANNs are one of the most frequent subjects in the scientific literature of Chemical and Process Engineering, and their use tends to continue growing. This can be explained by the advent of Industry 4.0, in which these data-driven models play an essential role in the implementation of intelligent systems in processes [72]. Thus, to remain relevant in this scenario, companies need specialized professionals on their teams. For this reason, this topic has been introduced into the curriculum of most Chemical Engineering degree programs [73].
Indeed, the continuous availability of large volumes of stored data in industrial processes will lead to the development of new ANN approaches for process modeling and data interpretation. These models will deliver more direct relationships between cause and effect variables for process optimization and control through MPC strategies. Therefore, the automation of entire plant units will lead to intelligent processes, capable of making decisions for safer operation and equipped with a reliable protection system against cyber-attacks.
Another subarea worth mentioning for future developments is the design of new materials. The use of ANNs has led to a decrease in the number of lengthy and costly laboratory experiments for analyzing the performance of polymers, ceramics, glasses, and, mainly, catalysts. Therefore, it is possible to convert data from past publications and from high-throughput (HT) experiments into information, leading to a considerable acceleration in developing new materials with better performance for a given process.

Conclusions and perspectives
This chapter presented ANNs and their Chemical and Process Engineering applications, showing how they have become a powerful tool for modeling chemical processes. This analysis also showed their increasing application, helping to understand and analyze process data features for future research in thermodynamics, transport phenomena, kinetics and catalysis, process analysis and optimization, and process safety and control.
The prospective availability of large volumes of good-quality data will make ANNs one of the most used methods to represent a process, estimate thermodynamic properties, develop new catalysts, replace complex phenomenological models, and improve control and safety strategies. Moreover, in real chemical processes, a given subset of the inputs affects only a subset of the outputs. Therefore, embedding first-principles knowledge in a data-driven machine learning model remains a challenge for future studies.