Data Analysis and Modeling Techniques of Welding Processes: The State-of-the-Art

Information contributes to better decision-making, process improvement, and error detection and prevention. The requirements of the coming Industry 4.0 will lead new information technologies to support the improvement of industrial processes and the decisions made about them. In the case of welding processes, several techniques have been used. Welding processes can be analyzed as a stochastic system with several inputs and outputs, which allows them to be studied from a data analysis perspective. Data mining processes, machine learning, deep learning, and reinforcement learning techniques have given good results in the analysis and control of systems as complex as the welding process. The increase in the quantity and quality of information acquired by current sensors provides a large volume of data that benefits these techniques. This research aims to make a bibliographic analysis of the techniques used in the welding area, the advantages that these new techniques can provide, and how some researchers are already using them. The chapter is organized according to some stages of the data mining process, with the objective of highlighting the evolution and potential of each stage for welding processes.


Introduction
Welding, as described in [1], is one of the most important processes for joining metals. It is used in the fabrication of simple structures, in the nuclear and petroleum industries, and in chemical components.
In a typical fusion welding process of metals, such as resistance welding, arc welding, electron beam welding, or laser welding, a heat source is applied locally to the interface of the two metals to be joined. The interface can be the metals' surfaces, where the faces are joined to each other by a nugget (e.g., spot or resistance welding). In arc welding, the interface is the weld seam. However, complex physical phenomena occur during heating/melting and cooling/solidifying, and these may produce adverse effects on weld properties and base metal properties [2]. In order to reduce adverse effects and obtain the desired results, many studies have been developed to monitor, predict, or control welding processes. All these studies are based on finding the optimal setting of the adjustable welding parameters.
All adjustable welding parameters, such as current or current waveform, heat input, wire feed speed, travel speed, and arc voltage, may be used as system inputs and be designed to assure the required outputs. For that reason, and because of the complexity of the interrelations among parameters, the welding process can be analyzed as a stochastic system with input and output parameters and several disturbances [2]. Chen's article [3] addressed the need to improve the information acquired from these welding parameters and to identify characteristics in order to improve and control the welding process results. Chen defined new objectives for modern welding manufacturing technology to show the way toward better welding processes. The article exposes some problems of intelligentized welding manufacturing technology (IWMT), which are shown in Figure 1.
Other areas of science have the potential to solve these problems. Computer science has had great results applying new techniques of data analysis, learning models, and intelligent control. The objective of data analysis is to reveal nontrivial features in a large amount of data. Due to the increase in the volume and complexity of data, more efficient data analysis techniques have been developed, and the welding process can be analyzed from this point of view. The analysis of the welding process with new techniques is therefore a continuation of the development of welding process analysis. This interdisciplinarity is one of the necessary contributions proclaimed by the so-called Industry 4.0, as shown in [4][5][6].
The fourth industrial revolution refers to the next manufacturing generation, where automation technology will be improved by self-optimization and intelligent feedback [7]. For this reason, the application of the most recent data analysis techniques and processes can contribute to better control and monitoring of welding processes. These techniques can be grouped into machine learning techniques [8][9][10][11][12], data mining processes [13][14][15][16][17], and control processes [18][19][20]. The interrelation of these areas and their origins is presented in Figure 2. Machine learning is a growing area of computer science, with far-reaching applications in data analysis [21]. Machine learning uses computer theory and statistics to build mathematical models with the goal of making inferences from a sample [22]. One fast-growing branch of machine learning is deep learning. These methods are an essential part of state-of-the-art research on speech recognition [  processes define several stages and methodologies to achieve these objectives, as exposed by Marbán in [34]. An important objective of data analysis is to reveal and indicate diverse, nontrivial features in data. For this reason, the welding process can be analyzed from this point of view.
A search conducted in the Web of Science from 2011 to October 3, 2018, shows the growing trend of these new data analysis techniques and processes in welding process research (Figure 3), but when compared with the research on welding process models, the growth is almost imperceptible, as shown in Figure 4. This demonstrates the need for this review to show these techniques, the advantages of their application, and the increasing trend of their utilization. This review can be summarized in the following stages:
1. Welding process: understanding of the welding processes being analyzed.
2. Sensors: analysis of some of the principal sensors in the welding process.
3. Data processing: analysis of techniques to transform sensor information into a welding process dataset.
4. Modeling welding process: analysis of some modeling techniques in the welding process.
5. Intelligent control of welding process: analysis of some intelligent control techniques in the welding process.
These stages have a close relationship with data mining processes such as the one in [34].

Welding process
The American Welding Society (AWS) definition of a welding process is: "a materials joining process which produces coalescence of materials by heating them to suitable temperatures with or without the application of pressure or by the application of pressure alone and with or without the use of filler material" [36].
AWS defines groups of welding techniques depending on the energy transfer mode. The processes analyzed in this chapter are grouped as shown in Table 1.
These groups present different parameters and characteristics that were analyzed in the articles presented in this chapter.

Arc welding
The arc welding group is characterized by the electric arc, the heat source most commonly used in fusion welding of metallic materials. The welding arc comprises a relatively small region of space characterized by high temperatures (similar to or even higher than those of the sun's surface), strong generation of light and ultraviolet radiation, intense flow of matter, and large gradients of physical properties. It has an adequate concentration of energy for localized base metal fusion, ease of control, low relative cost of equipment, and an acceptable level of health risk to its operators. The study of the arc is also of special interest in areas such as astrophysics and the electrical and nuclear industries [37]. The electric arc generates a complex interrelation of thermal, electrical, and magnetic parameters, which hampers studies based on definite theoretical formulations. Despite many studies, the electric arc is quite complex, and the knowledge available so far allows only a partial understanding of the phenomenon [1].

Resistance welding
Resistance welding is the joining of metals by applying pressure and passing current for a length of time through the metal area that is to be joined. Its principal advantage is that no other materials are needed to create the bond, which makes the process extremely cost effective. Resistance welding is used in a wide range of automotive, aerospace, and industrial applications. Its main parameters include welding time, welding force, contact resistance, and material properties [1]. Resistance spot welding (RSW), like all resistance welding processes, creates welds using heat generated by resistance to the flow of welding current between the faying surfaces, as well as force to push the workpieces together, applied over a defined period of time. Resistance spot welding uses the electrode face geometries to focus the welding current at the desired location and to apply force to the workpieces. Once sufficient resistance is generated, the materials set down and combine, and a weld nugget is formed [36]. The process is fast and effective, but it is also complicated due to complex interactions between electrical, mechanical, thermal, and metallurgical processes. The heat generation in RSW is due to the resistance of the parts being welded to the flow of a localized electric current, based on Joule's law. The quality of the joint in RSW is influenced by the welding parameters, mainly welding current, welding time, electrode force, and electrode geometry [38]. Large scale resistance spot welding (LSRSW), as mentioned in Table 1, is generally adopted in the automotive industry, where a vehicle structure includes thousands of spot welds. It presents the same parameters and complexity as RSW; only the parameters related to and influenced by its scale are increased [39].
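As a reminder of the Joule's law relation mentioned above, the heat generated in the weld zone can be written as follows. The symbols $Q$, $I(t)$, $R(t)$, and $t_w$ are notation assumed here for illustration and are not taken from the cited works.

```latex
Q = \int_{0}^{t_w} I(t)^{2}\, R(t)\, \mathrm{d}t \;\approx\; I^{2} R\, t_w
```

Here $Q$ is the generated heat (J), $I$ the welding current (A), $R$ the total resistance of the current path (Ω), and $t_w$ the welding time (s). The approximation on the right holds only when current and resistance are treated as constant during the weld. The quadratic dependence on current is consistent with welding current and welding time appearing among the main RSW parameters.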

Other welding processes
In this group, AWS presents various welding processes. Laser welding is the only process of this group found in the analyzed articles.
Laser beam welding is one of the most technically advanced welding processes. Laser welding is in general a keyhole fusion welding technique, achieved with the very high power density obtained by focusing a beam of laser light to a very fine spot [40]. This light beam heats the metals quickly so that the two pieces fuse together into one unit. Because the beam is very small and focused, the weld metal also cools very quickly. Laser welding operates in two fundamentally different modes: conduction limited welding and keyhole welding. The mode in which the laser beam interacts with the material being welded depends on the power density of the focused laser spot on the workpiece [41].
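Since the welding mode depends on the power density of the focused spot, a minimal sketch of how that quantity could be computed may help. It assumes a circular spot with uniformly distributed power, which is a simplification; the function name and the example values are hypothetical.

```python
import math


def power_density_w_per_cm2(laser_power_w: float, spot_diameter_mm: float) -> float:
    """Average power density (W/cm^2) of a circular spot, assuming uniform power."""
    if laser_power_w < 0 or spot_diameter_mm <= 0:
        raise ValueError("power must be >= 0 and spot diameter must be > 0")
    radius_cm = spot_diameter_mm / 10.0 / 2.0   # mm -> cm, diameter -> radius
    area_cm2 = math.pi * radius_cm ** 2
    return laser_power_w / area_cm2


# Hypothetical example: the same 2 kW beam focused to progressively smaller spots.
for diameter_mm in (2.0, 0.6, 0.2):
    density = power_density_w_per_cm2(2000.0, diameter_mm)
    print(f"spot {diameter_mm} mm -> {density:.3e} W/cm^2")
```

Halving the spot diameter multiplies the power density by four, which is why focusing has such a strong effect on the interaction mode. The power density at which the transition between modes occurs depends on the material and is not modeled here.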
Other parameters present in these processes are those of the final weld geometry, which behave differently in different processes and under different conditions. The parameters of the respective heat sources influence the final result of each welding process.
Welding is a complex process, so it requires more intelligent techniques in its analysis, monitoring, and production quality improvement. The use of sensors allows the acquisition of process parameters. The new artificial intelligence techniques will allow a better study, modeling, and control of these processes.

Sensors
Several sensors have been applied to monitor the welding process. The indirect sensing technologies for the weld bead and the weld pool can be classified as presented in [42] and in Figure 5.
Infrared vision techniques have been widely applied in the welding process [43-50]. One problem of this technique is that the environment in which it is applied can interfere with the precision of the data obtained from the process. This may be due to the heat emitted by the technologies used themselves.

Sound sensor
Sound may indicate conditions that generate weld defects. Acoustic information plays a relevant role for expert welders, as described in [51]. The sound signature produced by GMAW contains information about arc column behavior, the molten metal, and the metal transfer mode. High-speed data acquisition and computer-aided analysis of the sound signature may reveal these conditions [52,53]. Di Wu, in 2016 [54], monitored penetration and the keyhole with acoustic signals and image analysis. Lv et al. [55] proposed a recognition model to analyze the relationship between penetration state and arc sound. In 2017, Lv et al. [56] presented a welding quality control for pulse gas tungsten arc welding (P-GTAW), in which the welding acoustic signal was used in the design of an automated welding penetration control system.
In welding, it is easy to capture sound, but it is very difficult to analyze the noise and the differences in intensity that are sometimes generated. This is not a problem for deep learning techniques applied to sound, such as those presented in [32,57]. To understand welding sound analysis with deep learning techniques, it is necessary to correlate the sound with arc images in order to know what happens in the welding arc.

Vision sensor
Vision sensors are widely used in the welding process to analyze the weld pool [58,59], the arc-welding process [60,61], and weld bead geometry [62,63]. The intense light generated by the arc can make image acquisition difficult, and several techniques are used to deal with it. One of them was used by Chen in 2010 [64], who monitored and controlled the hybrid laser-gas metal arc welding process with an economical sensor system and a coaxial vision system, which was integrated from a relatively inexpensive industrial vision system and a personal computer (PC). Another visualization technique is Shadowgraphy, applied in Esdras Ramos's investigation in 2013 [65,66]. It is based on projecting the shadow of the process with a laser source.
In [60], laser illumination was used. To reduce the arc light, a narrow band interference filter was applied, and an image-analysis technique was used for precise measurements. This technique can obtain high quality images, but it can only be used in processes without material transfer.
Chen et al. [67] utilized a visual double-sided sensing system. In one frame, the weld-pool geometry parameters in GTAW process were determined.
With the high speed laser illumination in [68], high quality images are obtained. This technique is more recent, but it needs a more powerful laser than the Shadowgraphy technique and is also more expensive.

Data processing
Some papers define their own image processing techniques. Hong Yue in 2009 [69] processed weld images with classic techniques such as Laplacian, Gaussian, and neighborhood mean filters and threshold segmentation. Yanling Xu, in 2014 [70], proposed the Canny edge detection algorithm for detecting edges and extracting the characteristic parameters of the pool and the seam. Qian-Qian Wu [71] looked for the optimal filtering algorithm by comparing the Wiener, Gaussian, and median filters on welding seam images. In classic image processing, it is very difficult to generalize a filter or algorithm, because the result depends on the conditions and characteristics of the camera parameters and the light.
Another problem with the algorithms mentioned above is that, despite recent developments in computational resources, their response time is insufficient for real-time use in process control.
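To make the classic pipeline more concrete, the following is a minimal pure-Python sketch of two of the steps named above: a 3x3 median filter followed by threshold segmentation. It is not the implementation of any cited work; the tiny grayscale frame, the threshold level, and the function names are made up for illustration.

```python
from statistics import median


def median_filter_3x3(image):
    """Return a filtered copy of a 2-D grayscale image given as a list of rows.

    Border pixels use only the neighbors that exist inside the image.
    """
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(median(window))
        filtered.append(out_row)
    return filtered


def threshold(image, level):
    """Binary segmentation: 1 where the pixel is at least `level`, else 0."""
    return [[1 if px >= level else 0 for px in row] for row in image]


# Hypothetical frame: a bright pool region with one dark noise pixel inside it
# (value 0) and one isolated bright spatter pixel beside it (value 255).
frame = [
    [10, 12, 11, 10, 12, 11],
    [11, 200, 210, 205, 255, 12],
    [12, 205, 0, 210, 11, 10],
    [10, 208, 207, 209, 12, 11],
    [11, 10, 12, 11, 10, 12],
]

raw_mask = threshold(frame, 128)
clean_mask = threshold(median_filter_3x3(frame), 128)
for row in clean_mask:
    print(row)
```

In this toy frame the median filter removes both isolated outliers before thresholding, so the segmented pool region is not corrupted by them. As the text notes, the appropriate filter and threshold depend on the camera and lighting conditions, so these choices do not generalize.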
Deep learning techniques give efficient results in real-time execution [28] and in classification [24, 25], even when classifying new images. One example applied to the welding process is [62,63], which uses an autoencoder deep learning technique to extract features from process images in laser welding. Another recent application of deep learning is [72], which presents a deep learning-based method to extract information from photographs of spot welds. This monitoring system on the spot welding production line showed better performance than the previous image analysis.
Although not focused on welding arc analysis, the work in [73] obtained good results with an automatic detection of weld defects in X-ray images. A classification model based on a deep neural network was developed, and the accuracy rate of the proposed model was 91.84%. This is one more example of the potential of these techniques for image processing in the welding area.

Modeling welding process
Today's manufacturing environments face a rapidly growing demand for quality products. Many techniques and methods are applied to correlate process parameters and bead geometry. One of them is response surface methodology (RSM), which Sen applied in 2015 [74] to evaluate the correlations between process parameters and weld bead geometry in double-pulsed gas metal arc welding (DP-GMAW). With the same technique, Santhana Babu [75] obtained good results for predicting and controlling the weld bead quality in the GTAW process. The problem of this method is that the researcher has to find the equation, called the response surface, by trial and error, which can be very difficult. Many theoretical models have been defined to describe the processes that occur in the welding arc, including [76]. The main problem of these models is that they lose precision, because it is very difficult to obtain a formula that captures all the complexity of these processes, as affirmed by Hang Dong in [77]. Mathematical models based on machine learning techniques give better results in problems as complex as this one. In the same paper, Hang Dong expressed the potential of these models.
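For reference, the response surface discussed above is commonly taken to be a second-order polynomial in the process parameters. The form below is the usual textbook one and is an assumption here; the cited works may use a different order or set of terms.

```latex
y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^{2} + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon
```

Here $y$ is the response (for example, a bead geometry measure), $x_1, \dots, x_k$ are the process parameters, the $\beta$ coefficients are estimated from experiments, and $\varepsilon$ is the residual error. Choosing which terms to keep is the trial-and-error step mentioned in the text.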
One of the best-known and most used regression algorithms is the least squares method. It was used in [78] to predict the seam position under strong arc light influence. Another work is [79], in which a linear regression (LR) model is used to analyze the pool image centroid deviation and the weld deviation, based on visual weld deviation measurement in the GTAW process. Another technique is Gaussian process (GP) regression, which was used in [77] to better predict the performance of the GTAW arc welding process.
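As an illustration of the least squares method, the sketch below fits a straight line to a handful of points using the closed-form solution. The data (frame index against seam position in pixels) are invented for this example and do not come from [78] or [79].

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements: seam position (pixels) observed in successive frames.
frames = [0, 1, 2, 3, 4, 5]
seam_px = [100.2, 101.9, 104.1, 105.8, 108.3, 109.9]

slope, intercept = fit_line(frames, seam_px)
next_frame = 6
print(f"predicted seam position at frame {next_frame}: {slope * next_frame + intercept:.1f} px")
```

A straight line is the simplest possible model; the same least squares principle extends to several inputs and to polynomial terms, at the cost of solving a larger linear system.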
An interesting method, used in [80], is Mahalanobis Distance Measurement (MDM), which was employed to detect the occurrence of welding faults. The same method was used in 2017 by Khairul Muzaka [81] in the GMAW process to optimize the welding current in vertical-position welding. One problem of this method is that it only establishes the correlation as a function of one input.
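The idea behind a Mahalanobis distance check can be sketched as follows: estimate the mean and covariance of reference samples from good welds, then measure how far a new sample lies from that distribution. The two-feature form (voltage and current), the sample values, and the function names are assumptions for illustration and are not taken from [80] or [81].

```python
import math


def mean_and_covariance(samples):
    """Sample mean and 2x2 sample covariance of a list of (x, y) pairs."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis(point, mean, cov):
    """Mahalanobis distance of a 2-D point, using the closed-form 2x2 inverse."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (a, b), (_, d) = cov
    det = a * d - b * b
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    squared = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.sqrt(max(squared, 0.0))


# Hypothetical reference samples from good welds: (arc voltage in V, current in A).
reference = [(24.0, 180.0), (24.5, 185.0), (23.8, 178.0), (24.2, 183.0), (24.1, 181.0)]
mean, cov = mean_and_covariance(reference)

candidate = (26.0, 170.0)  # made-up sample to be checked
print(f"distance of candidate: {mahalanobis(candidate, mean, cov):.2f}")
```

A sample would be flagged as a possible fault when its distance exceeds a threshold chosen from the reference data. How that threshold is set is application specific and is not shown here.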
Bai and Lubecki [82] proposed a Localized Minimum and Maximum (LMM) analysis method for a real-time welding monitoring system. The problem of LMM is that it uses a simple function to measure quality, which does not capture the complexity of the system. For that reason, this work is limited to the short-circuit transfer mode.
In 2017, Junheung Park [83] proposed an SVM with bootstrap aggregating that reduced the noise in RSW data with computational efficiency. In this framework, other techniques such as Generalized Regressive Neural Networks (GRNN) and genetic algorithms for optimization were combined. This article demonstrates the growing use of more complex computer science techniques for better analysis of welding processes. However, the only way to know whether all this was necessary is to compare it with other techniques.

Artificial neural network models
Some researchers were already aware of the advantages of these algorithms. Bo Chen in 2009 [84] used an ANN trained on experimentally obtained data. The good result of the ANN prediction was validated by D-S evidence theory information fusion. ANNs have also been used for different purposes and in different welding processes: in the SAW process [85] and the GMAW cold metal transfer (CMT) process [86], for predicting weld bead geometry; in the GTAW process, for predicting the angular distortion considering the bead geometry [87]; in girth welded pipes, for predicting residual stresses [88]; and in the underwater wet welding process, for predicting the geometric parameters of the weld seams [89].
For better results, ANNs have been combined with other techniques. One example is [90], where an ANN and a Support Vector Machine (SVM) are used for weld defect detection and monitoring in a laser welding process. Another is the work of Bo Chen and Shanben Chen [91] for predicting the penetration in the GTAW process, in which different ANNs processed information from different sensors and the results were finally combined with the predictive fuzzy integral method.
Another example is [92], which used a Fuzzy ARTMAP ANN to predict bead height and width in the GMAW process as a monitoring task.
The increase in computational resources has allowed an increase in the complexity of ANN architectures, giving rise to the so-called Deep Neural Networks (DNN), which are gradually beginning to be applied to the welding process. One example is [93], whose model is based on a DNN architecture to study the estimation of weld bead parameters. This article mixed data from different welding processes, which is a risk for the analysis of results, since different processes can have different outcomes with the same input parameters.
Rao et al. [94] used the Generalized Regressive Neural Network (GRNN) technique for estimating and optimizing the vibratory assisted welding parameters to produce quality welded joints. In this case, however, no comparison with other algorithms was made.
Di Wu, in 2017 [95], addressed the Variable Polarity Plasma Arc Welding (VPPAW) process. A Deep Belief Network (DBN), which is a DNN variant, and t-Stochastic Neighbor Embedding (t-SNE) were studied for monitoring and identifying the penetration values. Experimental comparisons and verifications showed better performance for the DBN (97.62%). This reaffirms the good results offered by the learning models developed with these algorithms. However, this work does not take advantage of DNN algorithms to analyze both images and sound in real time. Figure 6 shows a summary of the articles analyzed. It shows that ANNs are among the most used techniques, but they do not always offer the best result. This demonstrates the need to compare various modeling techniques in order to identify the best one in terms of efficiency and computational cost.

Comparison of different models
As expressed in the previous sections, there are new techniques to analyze very complex systems, but they require expensive computational resources for their construction and sometimes for their execution. A comparison between models shows which model gives better results and which can be the most effective to use. This effectiveness is measured according to the needs of the problem, as shown in data mining (DM) methodologies and processes [16,17].
An interesting comparison is that between Support Vector Machine (SVM) and ANN models for identifying the weld groove state and extracting the weld deviation in rotating arc narrow gap MAG welding (RANGMW) [96]. The SVM models gave better results than the ANN model.
One comparison focused on time optimization was [97], which used an ANN alone and an ANN with a differential evolutionary algorithm (DEA). The results obtained by the ANN using DEA were close to those of the ANN alone, but the computational time of the ANN using DEA was shorter.
In the article [98], Response Surface Methodology (RSM) was compared with linear regression (LR), isotonic regression, regression trees, ANN, GP, and SVM to evaluate mechanical properties in the GMAW process. The results show that the DM models generalized more poorly in this research, because DM techniques require a large dataset to obtain acceptable results.
Sumesh in 2015 [99] compared Decision Trees (DT), ANN, Fuzzy Logic, SVM, and Random Forest techniques for weld quality monitoring in SMAW. The most efficient technique was Random Forest. This shows that the most complex techniques do not always offer the best results.
One of the few comparative analyses of algorithms is Kumar's paper in 2016 [100]. This paper explores the use of Self-Organizing Maps (SOM) as a mechanism for unsupervised learning, to compare the performance characteristics of various welding parameters, including welding power supplies and welders. The results obtained using SOM were compared with the Probability Density Distributions (PDDs) obtained during statistical analysis. Voltage and current data analyzed using the SOM technique can also be used to evaluate the arc welding process. These studies demonstrate that there are other potential algorithms for welding process analysis. For that reason, it is necessary to evaluate and compare several of them before one is chosen for a real-time process.
Another comparison is that of Di Wu in 2016 [54]. The article compared a prediction model for Plasma Arc Welding based on the Extreme Learning Machine (ELM) with ANN and SVM techniques. The ELM model had better generalization performance and was faster than the others. This potential was also established by Nandhitha in

[102] discusses that in the Resistance Spot Welding (RSW) process. He examined the prediction performance of the GRNN and k-Nearest Neighbor (kNN) algorithms. The results indicate that with a smaller k for kNN, the prediction performance measured by mean acceptable error increased.

[Table 3. Columns: Author, Year, Welding process]

Another article on welding quality is that of Xiaodong Wan in 2017 [103]. It proposed a Probabilistic Neural Network (PNN) model for quality prediction in the large scale RSW process. In this case, the PNN model was more appropriate for quality level classification than the Back Propagation Neural Network.
One of the most recent articles directly relating DM techniques and welding is that of Yiming Huang in 2017 [104]. It investigates porosity in pulsed gas tungsten arc welding (P-GTAW) with X-ray image analysis. To detect porosity, Empirical Mode Decomposition (EMD) and spectral analyses based on DM were performed.
In 2017, Petković [105] predicted laser welding quality by training computational intelligence methodologies and support vector regression (SVR), a variant of SVM for regression tasks. This article compared SVR, ANN, and GP. It is another example that, in certain problems, less complex algorithms can offer better results. Table 2 presents a series of articles based on the monitoring and quality of welding processes. The column Preparation defines the technique used to process the data obtained by the sensors: Classic for processes that do not use the latest image processing techniques, and DL for the use of deep learning. Online defines whether the model was executed in real time; Compare, whether the article compares several algorithms; and Modeling defines the algorithms used in the article. When a comparison exists, the first model before the comma gave the best quality result. As Tables 2-4 show, the best algorithm is not always the same.
Defining which technique is most effective for the problem at hand also contributes to the effectiveness of a future intelligent control process.


Intelligent control of welding process
The intelligent control approach offers interesting perspectives, since it provides methodologies that automate some of the tasks typically performed by humans [110]. It can be combined with data mining models.
One intelligent control trend is the combination of a fuzzy method with an ANN model. Examples are [111], in the GTAW process, for predicting the dynamics of the weld pool, and [112], in GMAW pipeline welding, for improving the welding quality.
Another example is [113], in the GMAW process, for modeling and control of weld bead width. A further example of fuzzy methods, but with a different modeling technique, is [114], which was applied to better control bead geometry parameters in the submerged arc welding (SAW) process. This article proposed a fuzzy logic approach combined with response surface methodology (RSM), demonstrating that any model obtained from a welding process can be integrated into a control system as long as it meets the time demands.
Conventional and intelligent control methods were investigated in [67] for the P-GTAW process. This work compared PID control, fuzzy control, and neuron self-learning PSD control. The latter had better performance, which highlights the advantage of learning-based control.
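For readers less familiar with the conventional baseline in that comparison, the following is a minimal sketch of a discrete PID controller. It is not the controller of [67]; the first-order plant, the gains, and the setpoint are made up, and the fuzzy and neuron self-learning PSD controllers are not shown.

```python
class PIDController:
    """Discrete PID controller with a fixed sample time dt (seconds)."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        if dt <= 0:
            raise ValueError("dt must be positive")
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None  # no derivative action on the first sample

    def update(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(controller, setpoint, steps, dt, gain=1.0, tau=0.5):
    """Hypothetical first-order plant: tau * dy/dt = gain * u - y (made-up dynamics)."""
    y, history = 0.0, []
    for _ in range(steps):
        u = controller.update(setpoint, y)
        y += dt * (gain * u - y) / tau  # explicit Euler integration step
        history.append(y)
    return history


# Hypothetical use: drive a simulated weld-pool width toward a 6.0 mm setpoint.
widths = simulate(PIDController(kp=1.0, ki=2.0, kd=0.0, dt=0.05), setpoint=6.0, steps=200, dt=0.05)
print(f"final simulated width: {widths[-1]:.2f} mm")
```

The gains here are fixed by hand. A learning-based controller differs precisely in that it adapts its parameters from data during operation.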
Another learning-based optimization was [115]. It proposed an ANN model with a Particle Swarm Optimization (PSO) algorithm to optimize weld bead geometry characteristics in the GMAW process. The ANN-PSO model achieved efficient optimization and multi-criteria modeling.
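The general idea of coupling a predictive model with PSO can be sketched as below. This is not the method of [115]: a made-up linear function stands in for the trained ANN, and the parameter bounds, target width, and PSO coefficients are all hypothetical.

```python
import random

TARGET_WIDTH_MM = 6.0
BOUNDS = [(120.0, 260.0), (3.0, 10.0)]  # hypothetical: current (A), travel speed (mm/s)


def surrogate_bead_width_mm(current_a, travel_speed_mm_s):
    """Hypothetical stand-in for a trained ANN; the coefficients are invented."""
    return 2.0 + 0.03 * current_a - 0.4 * travel_speed_mm_s


def objective(params):
    """Squared error between the predicted and the target bead width."""
    current_a, speed = params
    return (surrogate_bead_width_mm(current_a, speed) - TARGET_WIDTH_MM) ** 2


def pso_minimize(func, bounds, n_particles=20, n_iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO with positions clamped to the bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal bests
    best_val = [func(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = func(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


best_params, best_error = pso_minimize(objective, BOUNDS)
print(f"current = {best_params[0]:.1f} A, speed = {best_params[1]:.2f} mm/s, error = {best_error:.2e}")
```

Because the stand-in model is linear, many parameter combinations reach the target width, so the result depends on the random seed. With a real model and several criteria, the objective would combine the relevant bead geometry terms.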
An emerging learning-based control system was used by Günther in [62,63] for laser welding control. The technique, called reinforcement learning (RL), is a branch of machine learning focused on decision-making through a learning process [116]. Control learning can be an optimization-based method such as the Q-learning algorithm, which can be used to solve optimal control problems, as expressed in [117].
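The Q-learning update mentioned above can be sketched in tabular form as follows. The three-state environment (penetration under, on, or over target) and the three power adjustments are a toy example invented here; they are not Günther's laser welding setup. For determinism the sketch sweeps over all state-action pairs rather than exploring with a random policy, which is also an assumption.

```python
from collections import defaultdict

STATES = (0, 1, 2)     # hypothetical penetration level: 0 = under, 1 = target, 2 = over
ACTIONS = (-1, 0, 1)   # hypothetical power adjustment: decrease, hold, increase


def step(state, action):
    """Toy deterministic environment: reward +1 when the target level is reached."""
    next_state = min(max(state + action, 0), 2)
    reward = 1.0 if next_state == 1 else -1.0
    return next_state, reward


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def train(sweeps=200):
    """Apply the update repeatedly to every state-action pair of the toy environment."""
    q = defaultdict(float)
    for _ in range(sweeps):
        for s in STATES:
            for a in ACTIONS:
                next_state, reward = step(s, a)
                q_update(q, s, a, reward, next_state)
    return q


q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in STATES}
print(policy)
```

In this toy setting the learned greedy policy increases power when penetration is too low, holds it at the target, and decreases it when penetration is too high. A real welding controller would face continuous, noisy states, which is where function approximation and deep learning features come in.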
Günther's study [63] is one of the few RL studies for a laser welding system, which makes it an important contribution to welding process engineering. RL is a technique that is new to the welding process but has had notable success in other areas, as shown in [118][119][120].

Future perspective
The learning-based data analysis techniques presented in this article are not yet widespread in the welding process area. A bibliometric analysis of the authors studied in this research shows very little relationship between them, as Figure 7 shows. The small size of the authors' clouds (articles on welding processes and new data analysis techniques) and of their relationships (joint publications) indicates little maturity in the interrelation of these areas.
Some of the works show a small convergence between the areas, fulfilling the interdisciplinarity that Industry 4.0 advocates. Achieving this interdisciplinarity implies new study processes and new methodologies that unify the potential of these two areas. The needs of the modern world will make this happen in a short time. The new conception of data analysis in the welding process area will accelerate the development of new and better models and more efficient predictions and controls.

Conclusions
Several articles about the welding process were analyzed. They made it possible to determine, for each data mining stage, how the results can be optimized to obtain a good process analysis. Several algorithms for analyzing the welding process were shown, and it was demonstrated that comparing them can make the process analysis more efficient and less expensive. The potential of learning-based techniques was described; it grows as computational resources become cheaper and more high-quality information about the welding process can be obtained. All these premises are aligned with the so-called Industry 4.0, in which a set of technologies that allow a fusion of the physical and digital worlds creates a more intelligent and dynamic system.