A Practical Fitting Method Involving a Trade-Off Decision in the Parametrization Procedure of a Thermodynamic Model and Its Repercussion on Distillation Processes

The design of processes containing information on phase equilibria must be carried out through a series of steps, experimentation $ verification $ modeling $ simulation. Each of these steps should be rigorously performed to guarantee a good representation of the behavior of the system under study, whose adequate modeling could be used to simulate the corresponding process. To carry out the different previous tasks, two representative systems, extracted from known database, are used. The quality checking of experimental data series is certified through several thermodynamic consistency methods. The modeling is done by applying a multi-objective optimization procedure, which allows to define a solution front ( Pareto front ) for different sub-models that are established in this work. The fitness of trade-off solutions, obtained from the efficient front , on the design of distillation processes is analyzed through a simulation.


Introduction
Nowadays, process simulation [1] plays an important role in the chemical engineering field as an indispensable tool to gain precise knowledge about the process units. The use of powerful mathematical-computational tools allows to accomplish an optimal final design of chemical processes. An important matter is to ensure that the mathematical model correctly represents the quantities that reflect the state of the studied system. Then, what is a model? A model is a mathematical relationship that links the state variables of a system, such as temperature, pressure, or compositions, influencing the process performance. Therefore, the accuracy of the selected model is essential and greatly affects the final results of the simulation and design processes. Two milestones should be considered.
The first one is the selection of the model, built-in with the mathematical relationships that best represent the variables of the system. There is no a priori a procedure to make this choice, so heuristic or experience-based criteria are generally used [2]. The second question refers to get the best parameters that complete the definition of the model for a given data series. For this latter case, there are several numerical procedures [3,4] that allow to address the problem to optimize the parameter set considering the starting hypothesis.
The thermodynamic properties having the greatest influence on the simulation of separation processes are those related to phase equilibria (vapor-liquid equilibrium, VLE, in the scope of this work), as well as other thermodynamic quantities that arise in the mixing process. These properties are associated with the excess Gibbs energy g E , which is written as g E = g E (x i ,p,T). As such, the goal of the modeling is to achieve a functional type of f = f(θ,x i ,p,T) that minimizes the norm | g E À f |. The vector θ represents a set of parameters in the model, which must be optimized. However, due to the existing relation between phase and mixing properties, former approach may not be enough. In the first place, g E values can only be obtained from VLE, which satisfies a dependency of the type, F(x i ,p,T) = 0, preventing us from obtaining individual relations of g E with each one of the variables. On the other hand, defects in the experimental data could give place to incoherencies between experimental activity coefficients γ i = γ i (x i ,y i ,p,T) and the excess Gibbs function g E . Thus, the VLE fitting process becomes a bi-objective optimization problem; hence, two error functions are included in the correlation problem. The complexity grows as other properties, such as h E = h E (x,p,T), are included in the modeling, since new error functions should also be minimized. However, one of the benefits of this approach is to use a unique thermodynamic model that avoids the issues caused by possible discontinuities or inconsistencies between partial models of some properties. Even more relevant is that this allows the model to describe better the physics of the system under study. This supposes a mean to verify the coherence of the mathematical formalism imposed by thermodynamics, validating the different properties. A drawback of this practice is the increase of the complexity of the procedure, but this is just a numerical issue of relative complexity. Therefore, addressing the global modeling of the thermodynamic behavior of solutions as a problem with multiple objectives is a notable contribution in chemical engineering.
It is known that the resolution of multi-objective problems does not produce a single result; on the contrary, a set of non-dominated results that constitute the socalled Pareto front [5] is obtained. There is no precise mathematical criterion that allows the selection of a unique result. However, the process simulation requires a single result to define the intended design, having to resort to external criteria different from those used to obtain the front.
This study evaluates the effect of choosing the different results from the Pareto front in the simulation task. To achieve this, some partial goals are proposed such as (a) to establish a rigorous methodology to carry out the optimization procedure with the suggested modeling and (b) to check the real impact of the chosen model on the simulation, with the purpose of proposing a selection criterion. Thus, the designed methodology should include different stages, like the data selection used in the procedure, obtaining the result front and the election of the final result. Two systems, considered as standard in many studies on thermodynamic behavior of solutions, are selected since the necessary experimental information (VLE and h E ) is available in literature. After checking the data sets [6,7], making up the testing $ modeling steps, and having obtained the corresponding result fronts, each candidate model is analyzed on its suitability for process simulation.

Modeling procedures 2.1 Thermodynamic model
The correlation of the iso-p {VLE + h E } data for each one of the two systems selected is carried out using a parametric equation already used by us [8], which is applied to the excess Gibbs function of a binary system: and as described in [8]: where k g 2-1 is a fitting parameter. In this study, several forms of Eq. (2) are tested, by adapting it according to the availability of experimental data. Thus, different "sub-models" are defined by neglecting terms in Eq. (2) which give rise to the following four cases: From Eq. (1), the expression that characterizes the activity coefficients is Likewise, the excess enthalpies h E are calculated considering.
Eqs. (1) and (5) assume that the different sub-models established by Eq. (3) impose certain behavior hypothesis in the mixture. Thus, sub-model M1 implies that g E = h E . For the remaining assumptions, g E 6 ¼ h E , allowing different functions to express the h E s.

Calculation of the efficient fronts
To obtain the optimal parameters of the different sub-models that represent the data set iso-p VLE and h E , the multi-objective optimization (MOO) procedure is used, which is characterized by the vector where s(y E ) is the root of the mean square error (or RMSE) calculated by the model when representing the generic property y E : The result of such a problem is a discrete set of values, taken from the efficient front, s(g E /RT) = f[s(h E )], where each one of these give rise to different estimates of the thermodynamic behavior. Here the ε-constraint algorithm [3] is used to solve the MOO optimization problem. This procedure converts the multi-objective nonlinear problem (MNLP) into multiple single-objective problem (constrained nonlinear problem (CNLP)), thence minimizing only the first objective in Eq. (6), that is, using the other vector element to establish a constraint in the calculation, that is, The value of ε is limited, from 0 to 500 J mol À1 , since greater errors in the h E are not acceptable. The efficient front is then achieved by solving the CNLP (Eqs. (8) and (9)) for different values of ε in the indicated range. To do so, we used a hybrid evolutionary algorithm [3], coupled with Nelder-Mead algorithm [9], for local refinement.
For use we refer to data series with information for the two phases. This is a requirement to check the thermodynamic consistency. Figure 1(b) shows an azeotrope at high pressures, which moves from the acetone-rich region toward pure ethanol as pressure increases; the separation between the compositions in the two phases is reduced. Figure 1(c) and (d) contains, respectively, the variation of γ i and g E /RT with the liquid composition for the system at 101 kPa. Ethanol's activity coefficient shows some errors that surely affect the global consistency of the data series. High slopes are observed as x 1 ! 0 along with negative values of the natural logarithm of γ 1 for data series other than those referenced [14][15][16]. This is a clear sign of inconsistency. Table 1 shows the results obtained in the application of different consistency methods. The observation of the said table produces some comments which are interesting in the work development. The Wisniak test and the direct van Ness test accept most of the experimental series. The first of the methods rejects two (noted as n°10,12) while the second fails to validate only one, n°8. The Kojima test rejects five data series, with the error observed in the system n°12 being critical. A new methodology recently proposed by us [6,7] was also applied, which rejects nine systems: three by the integral form (n°9, 11,12) and six by the differential form (n°1, 3,4,8,9,12). These systems present obvious signs of inconsistency that are also observed by at least one of the other methods. Therefore, the overall assessment of these series (in other words, the quality of their experimental data) is not positive, ruling out the use of these series for other subsequent tasks, such as modeling and, specially, simulation.
Available iso-T data series are represented in Figure 5. Data are found for the whole composition interval only at 333 and 345 K. Several series were measured at 333 K [40][41][42], allowing their comparison. From the representation of p,x,y ( Figure 5(a)) and related mixing quantities ( Figure 5(b)), an acceptable coincidence is observed. Two of the data sets (Ref. [41]) show negative values of g E /RT in the extreme points, at infinite dilution, which is a clear sign of inconsistency.
the highest pressures (n°13-16). This is due to the wrong relationship between the pure component saturation temperature (Appendix Table A1) and equilibrium temperature. However, it is interesting to recognize that when using other vapor pressure parameters this defect can be solved. The data series (n°3, 4, 6-9, 19, 21-23) accepted by the proposed test [7] offer the highest consensus among all the battery of test about the quality of data, so these series could be used for subsequent tasks.
A total of 17 references were found in literature [36,[43][44][45][46][47][48][49][50][51][52][53][54][55][56][57][58] of h E measured in the range of [293-323] K and atmospheric pressure, except those data of Yi et al. [58], at 80 kPa. Figure 6(a) presents the data series considered in this study. It refers to a solution with endothermic effects (h E ≈ 900 J mol À1 at x = 0.5 and T = 298 K), decreasing as temperature increases; however, the high observed dispersion of experimental values is not sufficient evidence to ensure this effect. Inspection of the h E does not recommend excluding any of these data series; thus they will be considered in the data treatment.     Values obtained in the application of the thermodynamic consistency-test to VLE data of benzene + hexane at different conditions.

Modeling: efficient front and models
The VLE data series previously verified has been modeled according to the procedure described in Section 2. The efficient fronts obtained for each system are shown in Figure 7. Ref. [44], (+) Ref. [45], (◊) Ref. [46], (Â) Ref. [47], (△) Ref. [48], (▽) Ref. [36], (▷) Ref. [  Acetone(1) + ethanol (2): the efficient fronts of results for the binary acetone + ethanol were obtained from sub-models M2, M3, and M4. Sub-model M1 did not produce an acceptable representation of the thermodynamic behavior, so it was ignored. The obtained fronts shown in Figure 7 reveal that the three submodels produced similar results when acceptable error in h E is greater than 140 J mol À1 . This implies that, from this limit on, the problem becomes monoobjective, regardless of the sub-model used. For smaller errors of the h E , s(h E ) < 140 J mol À1 , differences between sub-models M2, M3, and M4 to reproduce the VLE data are significant, reducing the maximum error by approximately δs (g E /RT) ≈ 0.15 between them. The efficient front achieved with model M4 shows an almost constant s(g E /RT). The other two sub-models reveal an exponential behavior as s(h E ) decreases. From the set of results obtained, three of them are chosen (see Figure 7) to carry out a more detailed analysis on their ability to describe simultaneously the VLE and h E . Some particular comments on those three results are: • The result labeled as "1" (by M4) describes well the two properties under study, as seen in Figure 8, but at the expense of using more parameters. Specifically, this model links the state variables at equilibrium (Figure 8(a)), describing the observed folding at the lowest temperatures. The estimate of activity coefficients (Figure 8(b)) is acceptable. Estimations of g E /RT for this system are close to the set that displays the highest values. The estimate for the h E is acceptable, although the model does not reproduce the data series extracted from Ref. [20].
• Result 2 (M3), which displays the highest error in the VLE estimate, wrongly estimates an azeotrope and a whole range immiscible region (Figure 8(a)). This poor description of the liquid phase is due to the high values of γ i and the g E /RT (Figure 8(b and c)). At first, the h E is correctly represented by this parametrization, although the presence of the immiscibility truncates the validity region of the model (Figure 8(d)).
• Result 3 (M3) produces an unsatisfactory representation in the h E s, which also presents an inversion of the thermal gradient of this property.
Final model selection should ensure that selected parametrization describes, at least qualitatively, the thermodynamic behavior closest to the real one. In this case, result 2 reproduces almost exactly the h E , although it produces an incorrect g E behavior, hence limiting its subsequent applicability. The choice between results 1 and 3 will depend on the influence of the h E on the calculations in which the model is involved. This analysis will serve as a basis to inspect the results of the remaining binary (benzene + hexane) on the results emitted by each of the models.
Benzene(1) + hexane(2): sub-model M4 was not used for the binary benzene + hexane because of the similarity with the efficient front produced by M3 model. Thus, sub-models M1, M2, and M3 were only applied, as shown in Figure 9. Sub-models M2 and M3 produce very similar results, with almost constant error, s(g E /RT), lower than 5.5 Â 10 À3 J mol À1 for all s(h E ). Consequently, those result fronts of sub-models M1 and M2 are evaluated. The efficient front of sub-model M1 produces a quasi-linear behavior with the variation of s(h E ). The difference between this front and that of sub-model M2 is δs(g E /RT) ≈ 0.14, when ε = 0. The fronts for models M1 and M2 do not intersect, unlike the earlier case, showing a maximum value of s(h E ) at 170 J mol À1 and 450 J mol À1 , respectively. The VLE diagrams produced by the four selected results are shown in Figure 10, along with one of the validated data series for this result. The best description of this system is achieved with result 4 (M1), which reproduces the behavior of T-x-y experimental data (Figure 10(a)) and the other quantities calculated (Figure 10(b and c)). Nevertheless, the description of h E with this model is not good. Result 1 (M2) produces an azeotrope at x 1 < 0.2, which does not occur experimentally. This poor estimation occurs even though the greater number of parameters, increasing the model's capacity to reproduce the h E , as proven in Figure 11. Results 2 and 3 overestimate γ i , hence g E /RT (see Figure 10(b and c)).
This discrepancy gives rise to the formation of minimum boiling point azeotropes, which are not in accordance with experimental data. Of all the results chosen, only result 1 (belonging to sub-model M2) shows a h E that varies significantly with temperature, since sub-model M1 is independent of this variable. The use of either result 2 or 3 is discouraged since their estimations of h E , especially at temperatures other than 298 K, are not correct, in addition to the described issues in Figure 10. Plot of VLE at 101 kPa estimates for binary benzene(1) + hexane (2). Models drawn from Figure 9. representing the phase equilibria. This being said, result 3 represents the average behavior of this property in the working range. From previous observations, result 4 (M1) is recommended for those cases where the h E plays a secondary role in comparison to the reliable reproduction of the VLE diagram. Otherwise, result 1 (M2) is preferred, despite the qualitative misfit to experimental data. Table 3 presents an overview of the present section with a summary of the models to be used in the simulation task.

Simulation results of a rectification process for each of the studied dissolutions
The models obtained previously are used in the design of a simulation operation comparing their capacity in terms of some operation variables such as composition and temperature profiles, as well as energy consumption. General conditions for the simulations are summarized in Table 4. In all cases, columns are fed with a 1 kmol/h at equimolar composition of the corresponding solution, at 298.15 K and 101.32 kPa. Simulations are performed using the RadFrac block of AspenPlus © V8.8 (AspenOne © , [59]). Acetone + ethanol û ü üü üü Benzene + hexane üü üü ü û üü ! used for modeling and simulation;ü ! used for modeling; û ! not used. Table 3.
List of sub-models applied to each system. Benzene + hexane 10 0.6 30 20 Table 4. Operation data for the rectification columns to separate the binaries.

Binary system
Acetone(1) + ethanol (2): the simulation of a rectification column to purify the acetone+ethanol binary system, using the result labeled as "2" in Figure 7, does not provide a coherent resolution because it estimates the presence of two immiscible liquid phases. The final values obtained with results 1 and 3 are detailed in Table 5, while the composition profiles are shown in Figure 12. In both cases the composition of the distillate is higher than 99% in acetone and at the same temperature in the stage.
The exact purity is slightly higher with result 1 than with result 3, the difference being 0.2%. The calculation in the bottoms of the tower reveals differences between the two parametrizations, giving place to an effluent somewhat purer in the case of result 1. The difference between both models is 0.001 in molar fraction. These observations directly affect the calculation of the energy balance and, therefore, to the consumed energy. Thus, the consumption in the condenser is estimated similarly with both parametrizations, while that of the reboiler is significantly higher with result 3, due to the greater quantity of ethanol and the incorrect estimate of other quantities, such as the mixing enthalpies.

Stage
Result 1 (M4) Result 3 (M3)  Table 5. Quantities obtained in the simulation of a separation process for the binary acetone(1) + ethanol(2), using different values from the efficient front shown in Figure 7. The composition profiles and temperature gradient of the two tested solutions present similar qualitative behaviors. Most of the column is used as an enriching region which requires a high number of stages in both cases.
Benzene(1) + hexane (2): values obtained in the simulation of the distillation operation of this binary dissolution are in Table 6, while composition and temperature profiles are plotted in Figure 13. For the first three results (1, 2, and 3), the presence of an azeotrope limits the distillate composition. Thus, result 1 produces an effluent with a benzene composition of 16.7% (v/v), while for results 2 and 3, the compositions are, respectively, 37.2 and 30.9%. The simulation carried out with the parametrization of result 4 produces a purer distillate of 15.6%, due to the absence of the azeotrope. However, the folding effect observed between the equilibrium curves in experimental data as well as the diagram estimated by result 4 complicates the separation beyond this point. This justifies that the composition profiles are similar to results 1 and 4 (see Figure 13).
The residual streams obtained by results 1 and 4 contain benzene with a purity higher than 99.9%, while the other two results do not produce the separation of the  Table 6. Quantities obtained in the simulation of a distillation process for the binary benzene(1) + hexane(2), using different values from the efficient front shown in Figure 9. binary dissolution with high purities (no more than 79%), due to the presence of the azeotrope at intermediate composition. The differences noted in compositions in the head and bottom of the column cause that the corresponding temperatures are also different. Between results 1 and 4, there is a difference in temperature of almost 0.5 K, while results 2 and 3 estimate lower temperatures by more than 1 K with respect to the others. In this case, the modeling errors in another of properties, such as excess enthalpies, as well as the differences in the output composition, also affects to produce noteworthy differences in the energetic consumptions of both condenser and boiler. In this sense, similar results are obtained for results 1 and 4, although, a priori, it is not possible to indicate which of these is closer to the true behavior of the column.

Conclusions
The aim of this work is to present the practice carried out on a set of operations constituting a procedure involving the following sequences: experimentation $ verification $ modeling $ simulation. A description of the last three operations is proposed since the necessary experimental information (iso-p and iso-T VLE data and h E ) was extracted from the available publications. The proposed methodology is applied to two significant binary systems in the dissolution thermodynamic field, such as acetone-ethanol and benzene-hexane. The data checking is carried out with different classical methods, which is recommended in the recent literature [6]. In addition, a rigorous method recently designed by the authors [7] was also used in order to guarantee the quality of the experimental data used. This method has the advantage of offering some information about the origin of deficiencies in the experimental data.
A polynomial expression was used in the modeling step (see Eqs. (1) and (3)), used in the correlation of thermodynamic properties, from which four sub-models (M1-M4) are established in Section 2.1, depending on the availability of data for each system to avoid overfitting. Modeling in all cases was performed on the Gibbs dimensionless function, g E /RT (VLE), and on h E data, two-objective optimization approach addressed as a sequence of mono-objective subproblems according to the ε-constraint algorithm. A set of models are selected from the attained efficient fronts to provide a rationale on the trade-off decision task that is supposed to yield the final model for later uses, such as design/simulation. For the two studied cases, these fronts reveal a quasi-linear trend (with a negative slope) when the number of parameters used in the model is small. Efficient fronts'slope decreases as the number of parameters increases, until the error of VLE delinks to that of the h E for highly flexible models.
The rectification of the selected systems was simulated using the RadFrac block of AspenPlus © with the selected thermodynamic parametrizations. For the binary acetone + ethanol, some models showed an immiscible region, not reflected by the experimental information, giving rise to incoherent simulations. Two models gave rise to inexistent azeotropes in the benzene + hexane dissolution, with incorrect results in the simulation. On the other hand, errors in the excess enthalpy estimates did not influence on the procedure simulation, but it should be noted that it has an important role in the modeling of binary systems. In summary, the influence of a certain set of parameters on the simulation varies depending on each particular system. Besides, it is observed that, as the number of parameters grows in a model, the optimization problem mutes toward a mono-objective one since the considered criteria are invariant in one another. In these cases, taking the set of parameters that present the lowest error on h E is the best option. However, increasing the number of parameters might lead to overfitting if not enough attention is paid to the model extrapolation capabilities.
© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.