Open access peer-reviewed chapter

The Application of System Identification and Advanced Process Control to Improve Fermentation Process of Baker’s Yeast

By Zeynep Yilmazer Hitit, Baran Ozyurt and Suna Ertunc

Submitted: December 13th 2016; Reviewed: August 25th 2017; Published: November 8th 2017

DOI: 10.5772/intechopen.70696

Downloaded: 530

Abstract

The fermentation process of Saccharomyces cerevisiae has been investigated by many researchers seeking higher product quality and yield at lower cost. Operating parameters such as pH, dissolved oxygen (DO) concentration, temperature, substrate type and concentration, agitation speed, and air flow rate should be optimized to achieve valuable products. At this point, system identification and advanced control techniques emerge to provide solutions. Dynamic analyses of the pH and DO of the growth medium were performed under aerobic conditions in a batch bioreactor by applying step and square wave inputs to the base and air flow rates, respectively. Input–output data of the process and a linear Auto Regressive Moving Average with eXogenous input (ARMAX)-type model were used to determine the relationship between the controlled and manipulated variables in baker’s yeast production by system identification. The model parameters were estimated using the recursive least squares (RLS) method. The most suitable parametric model was determined by carrying out estimations with different values of the initial covariance matrix, the forgetting factor, and the order of the ARMAX model. Self-tuning generalized minimum variance (ST-GMV) control was performed with the ARMAX model for controlling pH and DO. Integrated square error (ISE) values were used as the performance criterion for the modeling and control studies.

Keywords

  • Baker’s yeast
  • S. cerevisiae
  • system identification
  • ARMAX model
  • RLS
  • ST-GMV control
  • dissolved oxygen control
  • pH control

1. Introduction

Saccharomyces cerevisiae microorganism, also known as baker’s yeast, is used in a variety of applications such as ethanol, glycerol, β-glucan, invertase enzyme, and mostly yeast production. Using molasses or glucose as the carbon source with batch or fed-batch operation, it is possible to produce ethanol and the yeast itself on commercial scale under anaerobic or aerobic conditions, respectively [1]. Due to the process economy, cells with high volumetric efficiency should be obtained at the growth phase of microorganisms [2]. S. cerevisiae research is continuing in biotechnology and genetics fields as well as R&D works in the food and pharmaceutical industry [3]. Many studies on β-glucan production emphasize the importance of S. cerevisiae production for use in the pharmaceutical field [4]. Production data must be examined well in order to better understand the process [3]. Microorganisms used as biocatalysts can be produced at high concentrations with high enzyme activity at suitable values of bioreactor operating conditions such as pH, temperature, dissolved oxygen (DO) concentration, air flow rate, agitation speed, and substrate concentration [2].

Bioprocesses are inherently time-varying: the operating parameters of these systems change with time, and these changes are not linear. In addition, mathematical models that describe the complex reactions taking place during cell growth and product formation are often lacking [5, 6]. Furthermore, few online sensors can detect state variables such as cell, substrate, and product concentrations. The substrate, oxygen, or the formed product can also inhibit the activity of the biocatalyst. Alpbaz et al. emphasized that S. cerevisiae is very sensitive to changes in the growth environment [7]. Because of these constraints in bioprocesses, it is necessary and important to determine the optimal operating conditions and to control the operating parameters at the determined optimum values to ensure economic gain, high-quality product, and safe operation [8]. Although installation, operation, and modeling studies are present in the literature for batch and fed-batch operation, these processes are very difficult to control because of both their biological nature and their dynamics throughout the process [9].

Besides the choice of control algorithm and the structure of the process model, the control performance can be affected by the controller-tuning parameters and the process model parameters. Parameters of parametric and nonparametric models are calculated using dynamic analyses of different disturbance effects and system identification algorithms. Parametric models are generally constructed using a discrete time polynomial model [10]. Step, square wave, pseudorandom binary sequence (PRBS), impulse, pulse, and random inputs are generally applied as input variables (such as flow rates of acid/base for pH, cooling/heating fluid flow rate for temperature, air flow rate for DO, and substrate feed for substrate concentration) to perform dynamic analyses and obtain input–output data from the process [11]. Various system identification algorithms such as Biermann, Levenberg–Marquardt, genetic, least squares (LS), and recursive least squares (RLS) have been studied for the calculation of the model parameters [10, 11]. The parameters of nonparametric models are calculated using the corresponding curves, such as the reaction curve and Bode diagrams [12, 13]. The actual system can be modeled with an approximate structure and its estimated parameters [3]. During control, the closed-loop performance of the system depends largely on the mismatch between the actual process and the model. Therefore, a particular model structure should include all the known information about operating conditions and approximate the system to a chosen degree. It should also be flexible and lead to fast parameter estimation procedures [14]. The Auto Regressive Moving Average with eXogenous (ARMAX)-type input polynomial model has been widely used in the literature because its basic structure can describe the process dynamics [2, 3, 10–12, 15–18].

The need for self-regulating controllers stems from the desire to control processes whose parameters are unknown or slowly changing over time [11]. The fundamentals of this method were based on the self-tuning regulator (STR) developed by Åström and Wittenmark [19]. The control objective of this method is to reduce the variance of the output variable (such as pH, temperature, DO, and substrate concentration) to a minimum. The STR predicts the future output variance and then tries to implement a control action that forces the estimated variance to be zero [11]. However, applications of the STR technique have encountered some difficulties, such as a lack of online tuning parameters, weak control of nonminimum phase systems, and poor control of systems with changing or unknown time delays [2]. Later, Clarke and Gawthrop modified the STR into the self-tuning generalized minimum variance (ST-GMV) controller to overcome these difficulties [20]. ST-GMV is an adaptive algorithm based on the GMV cost function and a predictive form of the process model. This formulation leads to easier tuning [21]. The ST-GMV method has become very popular and is widely used in industrial applications [22, 23].

In this chapter, dynamic analysis and system identification were carried out for the most important operating parameters of the baker’s yeast production process, namely the pH and DO of the growth medium. The dynamic analyses were conducted at 32°C, the optimal temperature for baker’s yeast production determined in previous studies, in order to explain the process behavior and to obtain the data used in the system identification step. Such a study must be performed for each specific process because the data depend mostly on the physical structure of the process. This chapter focuses especially on how the success of the system identification step affects the control applications. To examine how the controller performance depends on the process model structure and the model parameters, two contrasting models, referred to as the suitable and unsuitable models, were used in the ST-GMV simulation studies. Controller performance was evaluated for a constant set point trajectory under various noise conditions for both controlled variables. As in most studies in the literature, the simulations were conducted in the MATLAB environment and interpreted on the basis of the integrated square error.

2. Effects of operating parameters on baker’s yeast production

S. cerevisiae yeast is undoubtedly one of the most important microorganisms that have been consumed safely throughout human history. Yeast cells need both nutrients and energy for growth and product formation. They use sugars such as glucose, maltose, and sucrose, vitamins such as biotin, pantothenate, and inositol, and minerals such as Cu, Zn, Fe, Mo, and Mn to provide the necessary nutrients and energy [22]. High substrate concentrations may inhibit microorganism production because the substrate binds to a second, nonactive site on a form of the enzyme [22]. In the same case, ethanol production also occurs due to oxygen deficiency. This is known as the Crabtree effect, which is undesirable because it causes low yield in the fermentative growth of the cells [22]. The substrate level that inhibits yeast production depends essentially on the cell and substrate type. A glucose concentration above 200 g/L inhibits microorganism growth in yeast production [2].

Another essential requirement is oxygen. Yeast cells use oxygen together with sugar to grow without ethanol production. As in the case of beer and wine production, if there is not enough oxygen in the environment, yeast will continue to grow by producing ethanol. In order to produce the yeast in the desired way, the oxygen and sugar transfer to the growth medium should be good and the ethanol formation should be low. Oxygen is a limiting substrate because of its low solubility in water. The solubility of oxygen in water is 8 mg/L at 30°C [24]. It is known that the lower the DO concentration, the lower the substrate consumption and the rate of carbon dioxide formation; this is called the Pasteur effect [25]. When working under aerobic conditions, it is not enough merely to feed the oxygen source to the system. Providing a homogeneous distribution of oxygen in the liquid medium is also important [24]. The transfer of oxygen from the gas phase to the microorganism in the feed medium is of great importance in determining bioreactor design and operating conditions. Depending on whether the medium condition is aerobic or anaerobic, the following reactions occur during yeast production.

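$$\mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2} \xrightarrow{\text{Aerobic conditions}} 6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} + 688\ \text{kcal} \tag{1}$$
$$\mathrm{C_6H_{12}O_6} \xrightarrow{\text{Anaerobic conditions}} 2\,\mathrm{C_2H_5OH} + 2\,\mathrm{CO_2} + 56\ \text{kcal} \tag{2}$$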

Under aerobic conditions, when the yeast grows in a suitable environment, carbon dioxide and water form in the medium. If the goal is to produce yeast, aerobic conditions are required [12]. Otherwise, ethanol production occurs under anaerobic conditions [12]. Since ethanol itself is a carbon source, yeast cells can also use ethanol to grow, primarily during the diauxic phase [25]. Since ethanol is toxic, yeast prefers sugars to ethanol for growth. During growth, the yeast cell produces carbon dioxide. Because yeast does not use oxygen while producing ethanol, another reliable variable for describing the state of the yeast is the ratio of the carbon dioxide production rate (CPR) to the oxygen uptake rate (OUR). This ratio is defined as the respiratory quotient (RQ) [2].

The pH of the growth medium is an important operating parameter because it affects the activity of enzymes in the microorganisms. Yeast is resistant to acidic environments, as well as being very sensitive to alkali environments. It is preferable for the growth medium to be between pH 3 and 6 [24]. It is difficult for cations and anions to pass through the cells when not working at the appropriate pH for the microorganism. Disruption of cell permeability affects enzyme activity and causes protein synthesis to stop. Besides, cells become more susceptible to toxic substances. For these reasons, the pH of the growth medium is influential on the substrate consumption and production yield of the yeast.

Like all microorganisms, yeasts have minimum (5–25°C), maximum (40–50°C), and optimum (30–40°C) growth temperatures [26]. The activity of enzymes involved in microorganisms’ structures is greatly affected by temperature. In general, the temperature of the growth medium has a great influence on the growth of microorganisms, respiration, and product formation. According to the chemical kinetics, rise of temperature increases the reaction rate; however, enzymatic reactions adhere to this rule until a certain temperature. The temperature increase caused by the heat generated during the production of the baker’s yeast under aerobic conditions is undesirable in terms of product efficiency.

3. System identification

Before working on a system, it is necessary to know the system’s upper and lower limits in order to be aware of the possible situations that can be faced. To control the system, firstly the relations between the input variables and the output variables must be obtained and the model of the system must be developed. While modeling the systems, it is possible to use input–output data obtained by experimental studies or mass-energy balances. However, it is difficult to obtain a model by using mass-energy balances in complex systems, and in some cases these balances may be insufficient to accurately identify the system. In such cases, it is more useful to create models using system identification methods from the experimental input–output data [27]. System identification might be described as a method based on giving a disturbance to the input variable and consequently obtaining the output variable data of the system and determining the model. The unknown parameters in the parametric model are found by an appropriate method using input–output data. Then, the model is compared with the experimental data and the fitness is tested.

3.1. Signals used in system identification

The first step of system identification is the selection of input signals that will affect the system. Step, square wave, sinusoidal wave, PRBS, impulse, pulse, and random signals are generally applied as inputs [11]. Often, discrete time models are used to describe the system [8–11]. For this purpose, input and output variable signals are sampled and recorded at a suitable time interval. This chapter focuses on the step input as the input signal.

3.2. System models

Linear model of an open-loop discrete time system can be written in terms of u(t) as input variable and x(t) as output variable as shown in Eq. (3).

$$x(t) + a_1 x(t-1) + \cdots + a_{n_a} x(t-n_a) = b_0 u(t-1) + \cdots + b_{n_b} u(t-n_b) \tag{3}$$

Backward shift operator is defined as Eq. (4).

$$z^{-i} x(t) = x(t-i) \tag{4}$$

Here, x(t) represents the x value at time t, x(t − 1) represents the x value at time (t − Δt) for Δt = 1, and x(t − i) represents the x value at time (t − iΔt). Eq. (5) is defined as a discrete time transfer function.

$$x(t) = \frac{B}{A}\, u(t) \tag{5}$$

Here, polynomials of A and B can be written as Eqs. (6) and (7).

$$A = 1 + a_1 z^{-1} + \cdots + a_{n_a} z^{-n_a} \tag{6}$$
$$B = b_0 + b_1 z^{-1} + \cdots + b_{n_b} z^{-n_b} \tag{7}$$

Roots of polynomials A and B are poles and zeros of the system, respectively. If one of the poles or zeros of the system is placed outside the unit circle of the z-plane, system is defined as unstable or nonminimum phase, respectively [28]. In a self-tuning system, disturbance effects can occur in a wide variety of forms. The disturbance signal s(t) can be a part of the control system and is often treated as an additional disturbance factor at the output of the controlled process. In this case, the self-tuning controller will attempt to eliminate this disturbance effect. Such signals can be summed up in two groups as defined signals and random signals. In general, model of a constant random signal source is shown as follows.

$$s(t) = \frac{C}{A}\, e(t) \tag{8}$$

Here, C polynomial is defined as Eq. (9).

$$C = 1 + c_1 z^{-1} + c_2 z^{-2} + \cdots + c_{n_c} z^{-n_c} \tag{9}$$

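Eq. (8) is also described as an auto regressive moving average (ARMA) model. The whole system output can be written in various forms as in Eqs. (10)–(12).

$$y(t) = x(t) + s(t) \tag{10}$$
$$y(t) = \frac{B}{A}\, u(t-1) + \frac{C}{A}\, e(t) \tag{11}$$
$$y(t) + a_1 y(t-1) + \cdots + a_{n_a} y(t-n_a) = b_0 u(t-1) + \cdots + b_{n_b} u(t-n_b) + e(t) + c_1 e(t-1) + \cdots + c_{n_c} e(t-n_c) \tag{12}$$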

This model is obtained by adding a control input to the ARMA-type signal, and the system is called ARMAX model.
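As a minimal sketch of how Eq. (12) and the pole/zero definitions above can be used, the following Python code simulates an ARMAX difference equation and computes the roots of A and B. The function names, the coefficient layout (a = [a1, ...], b = [b0, b1, ...] acting on u(t − 1), u(t − 2), ..., and c = [c1, ...]), and the zero initial conditions are assumptions made for illustration; they are not taken from the MATLAB code used in this chapter.

```python
import numpy as np

def simulate_armax(a, b, c, u, e):
    """Simulate y(t) + a1*y(t-1) + ... = b0*u(t-1) + b1*u(t-2) + ...
                                         + e(t) + c1*e(t-1) + ...

    Assumed layout: a = [a1, ..., a_na], b = [b0, b1, ...], c = [c1, ..., c_nc].
    All samples before t = 0 are taken as zero.
    """
    n = len(u)
    y = np.zeros(n)
    for t in range(n):
        acc = e[t]
        for i, ai in enumerate(a, start=1):
            if t - i >= 0:
                acc -= ai * y[t - i]          # autoregressive part
        for j, bj in enumerate(b):
            if t - j - 1 >= 0:
                acc += bj * u[t - j - 1]      # exogenous input, one-step delay
        for i, ci in enumerate(c, start=1):
            if t - i >= 0:
                acc += ci * e[t - i]          # moving-average noise part
        y[t] = acc
    return y

def poles_and_zeros(a, b):
    """Roots of A and B after multiplying through by the highest power of z."""
    poles = np.roots([1.0, *a])
    zeros = np.roots(b)
    return poles, zeros
```

For example, a = [-0.5] with b = [1.0] gives a single pole at 0.5, inside the unit circle, whereas b = [1.0, 2.0] places a zero at −2, outside the unit circle, which is the nonminimum phase case described above.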

3.3. Estimation of model parameters

3.3.1. LS method

Eq. (11) is written in matrix form, and transpose of data and parameter vectors are defined as (ϕT) and (θT), respectively.

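$$\phi^T = \begin{bmatrix} -y(t-1) & \cdots & -y(t-n_a) & u(t-1) & \cdots & u(t-n_b) & e(t-1) & \cdots & e(t-n_c) \end{bmatrix} \tag{13}$$
$$\theta^T = \begin{bmatrix} a_1 & \cdots & a_{n_a} & b_0 & \cdots & b_{n_b} & c_1 & \cdots & c_{n_c} \end{bmatrix} \tag{14}$$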

Using Eqs. (13) and (14), output variable can be written as follows.

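$$y(t) = \phi^T \theta + e(t) \tag{15}$$

Eq. (15) is known as a linear-in-the-parameters model. There are N measurement values, where N is the number of data samples. Eq. (15) can be redefined as follows.

$$\begin{aligned} y(1) &= \phi^T(1)\,\theta + e(1) \\ y(2) &= \phi^T(2)\,\theta + e(2) \\ &\ \ \vdots \\ y(N) &= \phi^T(N)\,\theta + e(N) \end{aligned} \tag{16}$$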

Eq. (16) can be written in vector form as follows.

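$$Y = \begin{bmatrix} y(1) \\ y(2) \\ \vdots \\ y(N) \end{bmatrix}, \quad \varphi = \begin{bmatrix} \phi^T(1) \\ \phi^T(2) \\ \vdots \\ \phi^T(N) \end{bmatrix}, \quad E = \begin{bmatrix} e(1) \\ e(2) \\ \vdots \\ e(N) \end{bmatrix}, \qquad Y = \varphi\,\theta + E \tag{17}$$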

With the estimated parameter vector (θ̂), the output variable obtained from the model (ŷ) is written as given in Eq. (18).

$$\hat{y}(t) = \phi^T(t)\,\hat{\theta} \tag{18}$$

The difference between the measured output variable and the output variable calculated from the model is defined as the estimation error (ε(t)) and given in Eq. (19).

$$\varepsilon(t) = y(t) - \hat{y}(t) = y(t) - \phi^T(t)\,\hat{\theta} \tag{19}$$

Eq. (19) is rearranged for N measurements, and Eq. (20) is obtained.

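$$\begin{aligned} \varepsilon(1) &= y(1) - \hat{y}(1) = y(1) - \phi^T(1)\,\hat{\theta} \\ \varepsilon(2) &= y(2) - \hat{y}(2) = y(2) - \phi^T(2)\,\hat{\theta} \\ &\ \ \vdots \\ \varepsilon(N) &= y(N) - \hat{y}(N) = y(N) - \phi^T(N)\,\hat{\theta} \end{aligned} \tag{20}$$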

Eq. (20) can be written in vector form as follows.

$$\varepsilon = Y - \varphi\,\hat{\theta} \tag{21}$$

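In order to calculate θ̂ using the LS method, the cost function (J) given in Eq. (22) must be minimized. Substituting ε from Eq. (21) into Eq. (22) and rearranging gives Eq. (23).

$$J = \sum_{t=1}^{N} \varepsilon^2(t) = \varepsilon^T \varepsilon = \lVert \varepsilon \rVert^2 \tag{22}$$
$$J = \left(Y - \varphi\hat{\theta}\right)^T \left(Y - \varphi\hat{\theta}\right) = Y^T Y - \hat{\theta}^T \varphi^T Y - Y^T \varphi\,\hat{\theta} + \hat{\theta}^T \varphi^T \varphi\,\hat{\theta} \tag{23}$$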

To find the value that makes the Eq. (23) minimum, the derivative is taken, then equalized to zero, and rearranged and the predicted parameter vector calculated by the LS method is found as follows.

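$$\frac{\partial J}{\partial \hat{\theta}} = -2\,\varphi^T \left(Y - \varphi\,\hat{\theta}\right) = 0 \tag{24}$$
$$\hat{\theta} = \left(\varphi^T \varphi\right)^{-1} \varphi^T Y, \qquad \hat{\theta} = \left[\sum_{t=1}^{N} \phi(t)\,\phi^T(t)\right]^{-1} \left[\sum_{t=1}^{N} \phi(t)\, y(t)\right] \tag{25}$$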

Although the LS is a widely used method, it is not suitable for self-tuning and predictive control methods because the parameter calculation is not made in real time. In such control methods, the data and parameters must be solved at every t instant and updated.
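The batch estimate of Eq. (25) can be sketched in a few lines of Python. This is an illustrative sketch only: the noise regressors e(t − 1), ..., e(t − nc) are omitted because they are not measured (so the model reduces to an ARX form), nb is taken as the number of B coefficients acting on u(t − 1), ..., u(t − nb), and the function names are hypothetical. `numpy.linalg.lstsq` is used in place of the explicit inverse for numerical robustness.

```python
import numpy as np

def build_regressors(y, u, na, nb):
    """Stack rows phi^T(t) = [-y(t-1) ... -y(t-na), u(t-1) ... u(t-nb)].

    Assumption: noise terms are left out, so this is an ARX-type regressor.
    """
    start = max(na, nb)
    rows = []
    for t in range(start, len(y)):
        past_y = [-y[t - i] for i in range(1, na + 1)]
        past_u = [u[t - j] for j in range(1, nb + 1)]
        rows.append(past_y + past_u)
    return np.array(rows), np.asarray(y[start:])

def least_squares(y, u, na, nb):
    """Batch LS estimate, equivalent to (Phi^T Phi)^-1 Phi^T Y of Eq. (25)."""
    Phi, Y = build_regressors(y, u, na, nb)
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return theta
```

All N samples are needed before the estimate can be formed, which is the limitation noted above for real-time use.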

3.3.2. RLS method

Real-time parameter estimation is possible with RLS method and can be easily applied to self-tuning and predictive control algorithms for calculating time-varying model parameters. In the RLS method, the new value of the output variable is calculated by using the model parameters based on past data and the new input–output variable values. The actual value (y(t)) is compared with this estimated value, and the error (ε(t)) is found. The model parameters calculated in the previous step are updated with the newly calculated model parameters [27]. If Eq. (25) is written for any t instant, the parameter calculation equation will be as follows.

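$$\hat{\theta}(t) = \left[\varphi^T(t)\,\varphi(t)\right]^{-1} \varphi^T(t)\, Y(t) \tag{26}$$

Description of terms in Eq. (26) is in terms of vector forms as follows.

$$Y(t) = \begin{bmatrix} y(1) \\ y(2) \\ \vdots \\ y(t) \end{bmatrix}, \quad \varphi(t) = \begin{bmatrix} \phi^T(1) \\ \phi^T(2) \\ \vdots \\ \phi^T(t) \end{bmatrix} \tag{27}$$

If Eq. (26) is written for the next sampling time (t + 1), parameter calculation equation will be as follows.

$$\hat{\theta}(t+1) = \left[\varphi^T(t+1)\,\varphi(t+1)\right]^{-1} \varphi^T(t+1)\, Y(t+1) \tag{28}$$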

Terms φ(t + 1) and Y(t + 1) of Eq. (28) are written in vector form as follows.

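$$\varphi(t+1) = \begin{bmatrix} \varphi(t) \\ \phi^T(t+1) \end{bmatrix}, \quad Y(t+1) = \begin{bmatrix} Y(t) \\ y(t+1) \end{bmatrix} \tag{29}$$

Using these equations, the terms in Eq. (28) are updated.

$$\varphi^T(t+1)\,\varphi(t+1) = \begin{bmatrix} \varphi^T(t) & \phi(t+1) \end{bmatrix} \begin{bmatrix} \varphi(t) \\ \phi^T(t+1) \end{bmatrix} = \varphi^T(t)\,\varphi(t) + \phi(t+1)\,\phi^T(t+1) \tag{30}$$
$$\varphi^T(t+1)\, Y(t+1) = \begin{bmatrix} \varphi^T(t) & \phi(t+1) \end{bmatrix} \begin{bmatrix} Y(t) \\ y(t+1) \end{bmatrix} = \varphi^T(t)\, Y(t) + \phi(t+1)\, y(t+1) \tag{31}$$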

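The covariance matrix (P(t)) is defined as in Eq. (32); substituting this definition into Eq. (30) gives Eq. (33).

$$P(t) = \left[\varphi^T(t)\,\varphi(t)\right]^{-1} \tag{32}$$
$$P^{-1}(t+1) = P^{-1}(t) + \phi(t+1)\,\phi^T(t+1) \tag{33}$$

Using covariance matrix definition, Eqs. (26) and (28) can be rewritten as follows.

$$\hat{\theta}(t) = P(t)\,\varphi^T(t)\, Y(t) \tag{34}$$
$$\hat{\theta}(t+1) = P(t+1)\,\varphi^T(t+1)\, Y(t+1) \tag{35}$$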

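The φT(t + 1)Y(t + 1) term from Eq. (31) is substituted into Eq. (35) and rearranged as follows.

$$\hat{\theta}(t+1) = P(t+1)\left[\varphi^T(t)\, Y(t) + \phi(t+1)\, y(t+1)\right] = P(t+1)\,\varphi^T(t)\, Y(t) + P(t+1)\,\phi(t+1)\, y(t+1) \tag{36}$$

Eq. (37) is obtained using Eq. (34) as follows.

$$\varphi^T(t)\, Y(t) = P^{-1}(t)\,\hat{\theta}(t) \tag{37}$$

Eq. (33) is combined with Eq. (37) and rearranged as follows.

$$\varphi^T(t)\, Y(t) = P^{-1}(t+1)\,\hat{\theta}(t) - \phi(t+1)\,\phi^T(t+1)\,\hat{\theta}(t) \tag{38}$$

Eq. (38) is substituted for the φT(t)Y(t) term of Eq. (36).

$$\hat{\theta}(t+1) = P(t+1)\left[P^{-1}(t+1)\,\hat{\theta}(t) - \phi(t+1)\,\phi^T(t+1)\,\hat{\theta}(t)\right] + P(t+1)\,\phi(t+1)\, y(t+1) = \hat{\theta}(t) + P(t+1)\,\phi(t+1)\left[y(t+1) - \phi^T(t+1)\,\hat{\theta}(t)\right] \tag{39}$$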

Estimation error at time (t + 1) is defined as Eq. (40), and model parameter vector at time (t + 1) is found as Eq. (41).

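$$\varepsilon(t+1) = y(t+1) - \phi^T(t+1)\,\hat{\theta}(t) \tag{40}$$
$$\hat{\theta}(t+1) = \hat{\theta}(t) + P(t+1)\,\phi(t+1)\,\varepsilon(t+1) \tag{41}$$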

Matrix inversion is applied to Eq. (33), and future value of covariance matrix is obtained as Eq. (42).

$$P(t+1) = P(t) - \frac{P(t)\,\phi(t+1)\,\phi^T(t+1)\,P(t)}{1 + \phi^T(t+1)\,P(t)\,\phi(t+1)} \tag{42}$$
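The step from Eq. (33) to Eq. (42) is not spelled out; it presumably relies on the matrix inversion lemma in its rank-one (Sherman–Morrison) form, sketched here with the generic symbols M and v, which are not part of the notation of this chapter:

$$\left(M + v\,v^T\right)^{-1} = M^{-1} - \frac{M^{-1}\, v\, v^T M^{-1}}{1 + v^T M^{-1} v}$$

Setting M = P⁻¹(t) and v = ϕ(t + 1) gives M⁻¹ = P(t), and the right-hand side becomes the expression for P(t + 1) in Eq. (42). Because the denominator is a scalar, no matrix inverse has to be computed at each sampling instant.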

RLS method consists of Eqs. (40)–(42), and the algorithm used is given below.

At time t + 1,

  1. Form ϕ(t + 1) vector using new data.

  2. Calculate ε(t + 1) using Eq. (40).

  3. Calculate P(t + 1) using Eq. (42).

  4. Update model parameter vector θ̂(t + 1) using Eq. (41).

  5. Return to step 1.
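The steps above can be sketched in Python as follows. This is a hedged illustration, not the MATLAB routine used in the chapter: the class and function names are hypothetical, the noise regressors are omitted (ARX-type regressor), the ISE is assumed to be the plain sum of squared one-step prediction errors, and the forgetting factor is introduced through the standard exponentially weighted form of the covariance update, which reduces to Eq. (42) when the forgetting factor equals 1.

```python
import numpy as np

class RecursiveLeastSquares:
    """RLS estimator following Eqs. (40)-(42), with an assumed forgetting factor."""

    def __init__(self, n_params, p0=1000.0, forgetting=1.0):
        self.theta = np.zeros(n_params)      # initial parameter estimate
        self.P = p0 * np.eye(n_params)       # initial covariance matrix
        self.lam = forgetting

    def update(self, phi, y):
        phi = np.asarray(phi, dtype=float)
        eps = y - phi @ self.theta                           # Eq. (40)
        Pphi = self.P @ phi
        denom = self.lam + phi @ Pphi
        # Eq. (42) for lam = 1; division by lam discounts old data otherwise
        self.P = (self.P - np.outer(Pphi, Pphi) / denom) / self.lam
        self.theta = self.theta + self.P @ phi * eps         # Eq. (41)
        return eps

def identify(y, u, na, nb, p0=1000.0, forgetting=1.0):
    """Run RLS over a record; return the final parameters and the summed squared error."""
    rls = RecursiveLeastSquares(na + nb, p0, forgetting)
    ise = 0.0
    for t in range(max(na, nb), len(y)):
        phi = ([-y[t - i] for i in range(1, na + 1)]
               + [u[t - j] for j in range(1, nb + 1)])       # step 1: form phi
        eps = rls.update(phi, y[t])                          # steps 2 to 4
        ise += eps ** 2
    return rls.theta, ise
```

Calling `identify` for several values of `p0`, `forgetting`, `na`, and `nb` and comparing the returned error sums is one way a model comparison of this kind could be organized.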

4. ST-GMV control

Cost function of STR, defined as the difference between the set point and the measured value for an input–output model, is as follows.

$$J(u, t) = \Xi\left\{\left[y(t+k) - r(t+k)\right]^2\right\} \tag{43}$$

where y is the output variable, r is the set point, u is the manipulated variable (input), Ξ is the expectation, and k is the default time delay [16]. This cost function can be minimized by choosing u(t), the appropriate control output at time t. At the next sampling instant (t + Δt), a new situation occurs between y and r, and u must take a new value. If the default time delay is smaller than the time delay encountered in the real system, the control output will try to remove the noise components before they have been transmitted through the delay of the real system. This results in large feedback gains and an unrealizable controller that would make the system unstable. On the other hand, if the default time delay is greater than the time delay of the real system, the lowest possible noise level will not be obtained because the highest rate of manipulation is not provided [16]. Clarke and Gawthrop set out the ST-GMV method using the control cost of the STR of Åström and Wittenmark to remove the difficulties in the STR altogether [29]. ST-GMV control is a one-step-ahead optimal control strategy. The cost function of this technique is expressed by the following equation.

$$J(u, t) = \Xi\left\{\left[y(t+k) - r(t+k)\right]^2 + \lambda\, u^2(t)\right\} \tag{44}$$

This type of controller design can internally stabilize the system, and the stability depends on the selected λ values. ST-GMV algorithm has a good set point tracking characteristic and has the ability to control nonminimum phase systems. If the default time delay is implemented within the generalized system, then the control signal compensates the pseudo output φ(t) accordingly and directs the feed-forward path.

Using Eq. (44), ST-GMV method relies on maintaining closed-loop stability by taking λ as small as possible while maintaining a minimum output change to stay reasonably close to the expectation. Cost function can be generally expressed as follows.

$$J(u, t) = \Xi\left\{\varphi^2(t+k)\right\} \tag{45}$$

ST-GMV method uses a system pseudo output φ(t + k) given by the following equation to minimize the cost function expressed in general by Eq. (45).

$$\varphi(t+k) = P\, y(t+k) + Q\, u(t) - R\, r(t) \tag{46}$$

Here, r(t) is the set point, and P, Q, and R are transfer functions in the backward shift operator. The pseudo output of the system includes a feed-forward term (Q) and filters (P, R) on the output and the set point. The ST-GMV algorithm uses the feed-forward polynomial Q to avoid the problem of removing output noise before signal transmission. Using Eq. (11) with the default time delay, the φ(t + k) term of Eq. (46) can be expressed as follows.

$$\varphi(t+k) = \frac{PB + QA}{A}\, u(t) + \frac{PC}{A}\, e(t+k) - R\, r(t) \tag{47}$$

According to this equation, cost function to be minimized given by Eq. (45) will be the pseudo output variation. ST-GMV control algorithm divides the system into parts. For this, firstly, the error term is fragmented to include past, current, and future data.

$$\frac{PC}{A}\, e(t+k) = E\, e(t+k) + z^{-k}\, \frac{G}{A}\, e(t+k) \tag{48}$$

Both sides of Eq. (48) are multiplied by A and rearranged as follows.

$$PC = AE + z^{-k}\, G \tag{49}$$

Polynomials are written as follows.

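$$A = 1 + a_1 z^{-1} + \cdots + a_{n_a} z^{-n_a} \tag{50}$$
$$B = b_0 + b_1 z^{-1} + \cdots + b_{n_b} z^{-n_b} \tag{51}$$
$$C = 1 + c_1 z^{-1} + \cdots + c_{n_c} z^{-n_c} \tag{52}$$
$$E = 1 + e_1 z^{-1} + \cdots + e_{k-1} z^{-(k-1)} \tag{53}$$
$$G = g_0 + g_1 z^{-1} + \cdots + g_{n_g} z^{-n_g} \tag{54}$$
$$P = 1 + p_1 z^{-1} + \cdots + p_{n_p} z^{-n_p} \tag{55}$$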
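One common way to obtain E and G from the identity PC = AE + z^(-k) G in Eq. (49) is k steps of polynomial long division of PC by A. The sketch below assumes this approach (the chapter does not state how the identity is solved); the function name is hypothetical, and each polynomial is passed as a list of coefficients in ascending powers of z^(-1).

```python
import numpy as np

def solve_diophantine(A, C, P, k):
    """Return (E, G) such that P*C = A*E + z^-k * G.

    E holds the first k quotient coefficients of PC / A; G is the remainder
    shifted back by k samples. Coefficients are in ascending powers of z^-1.
    """
    A = np.asarray(A, dtype=float)
    PC = np.convolve(P, C).astype(float)
    length = max(len(PC), len(A) + k - 1, k + 1)
    rem = np.zeros(length)
    rem[:len(PC)] = PC
    E = np.zeros(k)
    for i in range(k):
        E[i] = rem[i] / A[0]                 # next quotient coefficient
        rem[i:i + len(A)] -= E[i] * A        # subtract E[i] * z^-i * A
    G = rem[k:].copy()                       # what is left is z^-k * G
    return E, G
```

For A = 1 − 0.9 z^(-1), C = 1 + 0.5 z^(-1), P = 1, and k = 2, the division gives E = 1 + 1.4 z^(-1) and G = 1.26, and multiplying out AE + z^(-2) G recovers PC.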

To evaluate the AE term of Eq. (49), the ARMAX model including an offset d is written as follows.

$$A\, y(t+k) = B\, u(t) + C\, e(t+k) + d \tag{56}$$

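If both sides of Eq. (56) are multiplied by E and Eq. (49) is substituted, Eq. (57) is obtained.

$$PC\, y(t+k) = BE\, u(t) + CE\, e(t+k) + E\, d + G\, y(t) \tag{57}$$

Both sides of Eq. (57) are divided by C, and Eq. (58) is obtained.

$$P\, y(t+k) = \frac{BE}{C}\, u(t) + E\, e(t+k) + \frac{E\, d}{C} + \frac{G}{C}\, y(t) \tag{58}$$

Eq. (58) is combined with Eq. (46) and rearranged as follows.

$$\varphi(t+k) = \frac{1}{C}\left[(BE + QC)\, u(t) + G\, y(t) - CR\, r(t) + E\, d\right] + E\, e(t+k) \tag{59}$$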

Eq. (59) is the sum of current and future terms. The current terms can be expressed as follows and represent the best estimate of φ(t + k) made using the data up to time t.

$$\varphi(t+k \mid t) = \frac{1}{C}\left[(BE + QC)\, u(t) + G\, y(t) - CR\, r(t) + E\, d\right] \tag{60}$$

The second term is the estimation error caused by the noise source, e(t + 1), e(t + 2), …, e(t + k). As mentioned before, it cannot be removed using the control signal u(t).

$$E\, e(t+k) = \varphi(t+k) - \varphi(t+k \mid t) \tag{61}$$

So, J is minimized by setting Eq. (60) to zero.

$$\varphi(t+k \mid t) = 0 \;\Rightarrow\; \frac{(BE + QC)\, u(t) + G\, y(t) - CR\, r(t) + E\, d}{C} = 0 \tag{62}$$

Eq. (62) is rearranged using the following definitions.

$$F = BE + QC \tag{63}$$
$$H = CR \tag{64}$$

ST-GMV control law can be expressed as follows.

$$F\, u(t) + G\, y(t) - H\, r(t) + E\, d = 0 \tag{65}$$

Calculation of input variable using ST-GMV control law is made using following equation.

$$u(t) = \frac{H\, r(t) - G\, y(t) - E\, d}{F} \tag{66}$$

Application of ST-GMV algorithm consists of following steps [16, 29]:

1) Apply a PRBS to the system as a forcing function and obtain the plant output.

2) Estimate F, G, H from Eq. (65), implementing the RLS algorithm.

3) Employ Eq. (66) to evaluate the control signal.

4) Apply the control signal.

5) The system output is obtained.

6) Return to step 1.
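The control law of Eq. (66) can be sketched as a single function once F, G, H, E, and d are available. In the sketch below these polynomials are supplied directly rather than estimated by RLS, which is a simplification of the self-tuning steps listed above; the function names and the convention that each history list starts with the newest sample are assumptions. Because F is a polynomial in z^(-1), the division by F is carried out by moving its delayed terms to the right-hand side and dividing by its leading coefficient.

```python
import numpy as np

def filt(poly, hist):
    """Apply a polynomial in z^-1 to a signal history (hist[0] is the newest sample)."""
    n = min(len(poly), len(hist))
    return float(np.dot(poly[:n], hist[:n]))

def gmv_control(F, G, H, E, d, u_past, y_hist, r_hist):
    """Solve F u(t) + G y(t) - H r(t) + E d = 0 for u(t), Eq. (65) and Eq. (66).

    u_past = [u(t-1), u(t-2), ...], y_hist = [y(t), y(t-1), ...],
    r_hist = [r(t), r(t-1), ...]. The offset d is constant, so E d = d * sum(E).
    """
    rhs = filt(H, r_hist) - filt(G, y_hist) - d * sum(E)
    rhs -= filt(F[1:], u_past)               # delayed terms of F u(t)
    return rhs / F[0]
```

As a toy check under these assumptions, take the plant y(t + 1) = 0.9 y(t) + u(t) with k = 1, P = R = C = 1, and d = 0, so that E = 1, G = 0.9, H = 1, and F = 1 + q for a constant Q = q. With q = 0 the output reaches the set point in one step; with q > 0 the control effort is smaller, and this particular loop settles at r/(1 + 0.1q), which illustrates how the weighting trades tracking accuracy against control action.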

5. Material and methods

5.1. Microorganism, inoculum preculture, and growth medium

S. cerevisiae NRRL Y-567 was obtained from NRRL-Agricultural Research Service Culture Collection. Preculture and growth media consist of 2% glucose, 0.6% yeast extract, 0.3% K2HPO4, 0.335% (NH4)2SO4, 0.376% NaH2PO4, 0.052% MgSO4·7H2O, and 0.0017% CaCl2·4H2O which were sterilized by autoclaving under 1.2 atm at 121°C for 20 min. Microorganisms were incubated for 8 hours at 32°C at 120 rpm, and inoculum ratio of 1:10 was used for scale enlargement.

5.2. Experimental system

The changes in DO and pH over time during baker’s yeast production were observed in a laboratory-scale bioreactor with a 2-L working volume, which was operated continuously. The input–output data were recorded, and the ARMAX model parameters were determined by the RLS method written in MATLAB. The experimental system is given in Figure 1. The most suitable parametric model was estimated using different values of α (initial value of the covariance matrix), λ (forgetting factor), and the order of the parametric model. During the experiments, DO and pH were measured with a WTW Oxi 340 with a polarographic DO sensor and a WTW pH340i pH meter, respectively. The DO and pH probes immersed in the bioreactor measure the DO and pH values of the growth medium online; the DO and pH meters convert these values to electrical signals, which reach the I/O card in the computer via the carrier interface modules. The signals arriving at the card are interpreted by the algorithm written in Visual Basic in the ADVANTECH VISIDAQ package program, and the resulting signals are sent to the system online. The operating conditions of the bioreactor are given in Table 1.

Figure 1.

Experimental system.

| Temperature (°C) | Air flow rate (vvm) | Cooling water flow rate (mL/min) | Cooling water temperature (°C) | Agitation speed (rpm) |
|---|---|---|---|---|
| 32 | 1 | 55 | 21 | 600 |

Table 1.

Operating conditions of the bioreactor.

6. Results and discussion

6.1. Dynamic analysis

6.1.1. Step input given to air flow rate

In baker’s yeast production, the oxygen concentration must not fall below the critical value (0.7 mg/L); therefore, a manipulated variable must first be selected to control the DO [12]. For that purpose, the air flow rate was chosen as the manipulated variable for DO control. To verify that this variable allows effective control, its effect on the DO in the bioreactor was investigated. While the system was at steady state at 1 mg/L DO with an air flow rate of 0.5 L/min, a positive step input was applied by raising the air flow rate to 3.4 L/min, and the change in DO over time was observed. In this case, the DO increased to 3 mg/L, as can be seen in Figure 2.

Figure 2.

Positive step input given to air flow rate from 0.5 to 3.4 L/min.

6.1.2. Step input given to base flow rate

During yeast growth, the degradation of glucose under aerobic conditions, which stores chemical energy in ATP molecules, increases the concentration of H+ ions and thus decreases the pH. A decrease in the pH of the medium affects not only cell division but also the cleavage rate, the production of many products from yeast, and the activity of enzymes. It is therefore necessary to determine a manipulated variable in order to achieve pH control, and the base flow rate was selected for this purpose. To verify that this variable allows effective control, the effect of the base flow rate on the pH in the bioreactor was investigated. The bioreactor was settled at pH 3.90 under the specified operating conditions, and then the microorganism was fed to the bioreactor. A positive step input was given to the base flow rate from 0.26 to 1.41 mL/min with 0.05 M NaOH solution. The acid (H2SO4) flow rate was kept constant at 0.22 mL/min. The resulting change in pH over time is shown in Figure 3.

Figure 3.

Positive step input given to base flow rate from 0.26 to 1.41 mL/min.

6.2. System identification results

6.2.1. Determination of model parameters for controlled variable of DO

In order to find the most appropriate ARMAX model using the data of the manipulated variable (air flow rate) and the controlled variable (DO) obtained from the dynamic analysis, the RLS algorithm was run with various forgetting factors (0.96–1), initial values of the covariance matrix (1, 100, 1000, 10,000), and model orders (na = 2, nb = 1; na = 2, nb = 2; na = 3, nb = 1; na = 3, nb = 2), and the integrated square error (ISE) was used for comparison. The models with the lowest and highest ISE values were used in the ST-GMV control algorithm to demonstrate the effect of the model structure on the control performance, which will be explained in the next section. The compared values in terms of estimation performance are given in Table 2.

| λ | P | ISE (na = 2, nb = 1) | ISE (na = 2, nb = 2) | ISE (na = 3, nb = 1) | ISE (na = 3, nb = 2) |
|---|---|---|---|---|---|
| 0.96 | 1 | 1.8720 | 1.2037e+36 | 1.5937 | 1.9718e+61 |
| | 100 | 1.6297 | 3.8648e+05 | 1.4280 | 2.2419e+34 |
| | 1000 | 1.6110 | 9.9163e+23 | 1.4124 | 8.5341e+04 |
| | 10,000 | 1.6080 | 7.4019e+25 | 1.4089 | 1.7541e+25 |
| 0.97 | 1 | 2.0045 | 1.9599 | 1.7180 | 1.6972 |
| | 100 | 1.7544 | 1.7441 | 1.5485 | 1.5402 |
| | 1000 | 1.7337 | 1.7304 | 1.5306 | 1.5278 |
| | 10,000 | 1.7308 | 1.7295 | 1.5267 | 1.5254 |
| 0.98 | 1 | 2.1602 | 2.1154 | 1.8570 | 1.8365 |
| | 100 | 1.9039 | 1.8920 | 1.6852 | 1.6752 |
| | 1000 | 1.8803 | 1.8772 | 1.6641 | 1.6610 |
| | 10,000 | 1.8776 | 1.8765 | 1.6597 | 1.6585 |
| 0.99 | 1 | 2.3344 | 2.2891 | 2.0055 | 1.9853 |
| | 100 | 2.0702 | 2.0559 | 1.8307 | 1.8186 |
| | 1000 | 2.0427 | 2.0395 | 1.8052 | 1.8018 |
| | 10,000 | 2.0404 | 2.0392 | 1.8002 | 1.7991 |
| 1 | 1 | 2.4483 | 2.4060 | 2.1171 | 2.0993 |
| | 100 | 2.1663 | 2.1490 | 1.9384 | 1.9243 |
| | 1000 | 2.1336 | 2.1304 | 1.9077 | 1.9040 |
| | 10,000 | 2.1317 | 2.1305 | 1.9020 | 1.9010 |

Table 2.

Estimated performance criteria as a result of positive step from 0.5 to 3.4 L/min for air flow rate and ISE values.

For the forgetting factor λ, the ISE values decrease between 0.96 and 0.97 but increase for λ above 0.97. Increasing the initial value of the covariance matrix from 1 to 10,000 decreased the ISE values (Table 2).

Consequently, the most suitable ARMAX model was obtained with the order na = 3, nb = 2, a forgetting factor of 0.97, and an initial covariance matrix value of 1000. The least successful ARMAX model had the order na = 2, nb = 1, a forgetting factor of 1, and an initial covariance matrix value of 1. Based on these results, the ARMAX model chosen for GMV control is given in Eq. (67).

$$y_t - 0.1928\,y_{t-1} + 0.0338\,y_{t-2} - 0.0168\,y_{t-3} = 0.1853\,u_{t-1} + 0.5794\,u_{t-2} + e_{t-1} \tag{67}$$

The RLS estimation of DO in the growth medium in response to changes in the air flow rate, obtained with this model (na = 3, nb = 2, λ = 0.97, P = 1000), is given in Figure 4.
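As a quick plausibility check on the identified DO model, its noise-free unit-step response and steady-state gain can be computed directly from the difference equation. The sketch below assumes that the minus signs lost in the printed Eq. (67) fall on the 0.1928 and 0.0168 terms, ignores the noise term, and treats the variables as dimensionless; the function name `step_response` is hypothetical.

```python
import numpy as np

# Coefficients of Eq. (67), signs assumed:
# y(t) + a1*y(t-1) + a2*y(t-2) + a3*y(t-3) = b0*u(t-1) + b1*u(t-2)
a = np.array([-0.1928, 0.0338, -0.0168])
b = np.array([0.1853, 0.5794])

def step_response(a, b, n_steps=40):
    """Noise-free response of the difference equation to a unit step in u."""
    y = np.zeros(n_steps)
    u = np.ones(n_steps)
    for t in range(n_steps):
        past_y = sum(-a[i] * y[t - 1 - i] for i in range(len(a)) if t - 1 - i >= 0)
        past_u = sum(b[j] * u[t - 1 - j] for j in range(len(b)) if t - 1 - j >= 0)
        y[t] = past_y + past_u
    return y

# Steady-state gain B(1)/A(1) of the assumed model.
gain = b.sum() / (1.0 + a.sum())
y_step = step_response(a, b)
print("steady-state gain:", round(gain, 4))
print("step response after 40 samples:", round(y_step[-1], 4))
```

Under these assumptions the gain is about 0.93 and the response settles within a few samples, which is the behavior one would expect from a model with small autoregressive coefficients.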

Figure 4.

Experimental system identification estimation of DO (na = 3 nb = 2, λ = 0.97, P = 1000).

6.2.2. Determination of model parameters for controlled variable of pH

To find the most appropriate ARMAX model from the dynamic-analysis data of the manipulated variable (base flow rate) and the controlled variable (pH), the RLS algorithm was run with various forgetting factors (0.96–1), initial values of the covariance matrix (1, 100, 1000, 10,000), and model orders (na = 2, nb = 1; na = 2, nb = 2; na = 3, nb = 1; na = 3, nb = 2), and the ISE values of the prediction were used for comparison. The models with the lowest and highest ISE values were then used in the ST-GMV control algorithm, as explained in the next section. The estimation performance of the compared cases is given in Table 3.

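| λ | P | ISE (na = 2, nb = 1) | ISE (na = 2, nb = 2) | ISE (na = 3, nb = 1) | ISE (na = 3, nb = 2) |
|---|---|---|---|---|---|
| 0.96 | 1 | 1.2954 | 1.2948 | 1.1136 | 1.2699 |
| 0.96 | 100 | 1.2394 | 1.2278 | 1.0715 | 1.0658 |
| 0.96 | 1000 | 1.2096 | 1.2029 | 1.0521 | 1.0479 |
| 0.96 | 10,000 | 1.1952 | 1.1943 | 1.0410 | 1.0383 |
| 0.97 | 1 | 1.3644 | 1.3631 | 1.1858 | 1.1851 |
| 0.97 | 100 | 1.3160 | 1.3062 | 1.1474 | 1.1423 |
| 0.97 | 1000 | 1.2815 | 1.2731 | 1.1260 | 1.1204 |
| 0.97 | 10,000 | 1.2620 | 1.2595 | 1.1117 | 1.1091 |
| 0.98 | 1 | 1.4439 | 1.4428 | 1.2706 | 1.2699 |
| 0.98 | 100 | 1.4020 | 1.3940 | 1.2351 | 1.2312 |
| 0.98 | 1000 | 1.3670 | 1.3558 | 1.2149 | 1.2077 |
| 0.98 | 10,000 | 1.3405 | 1.3373 | 1.1961 | 1.1929 |
| 0.99 | 1 | 1.5302 | 1.5292 | 1.3676 | 1.3669 |
| 0.99 | 100 | 1.4910 | 1.4852 | 1.3326 | 1.3298 |
| 0.99 | 1000 | 1.4602 | 1.4468 | 1.3159 | 1.3079 |
| 0.99 | 10,000 | 1.4257 | 1.4213 | 1.2930 | 1.2887 |
| 1 | 1 | 1.6207 | 1.6202 | 1.4828 | 1.4820 |
| 1 | 100 | 1.5827 | 1.5785 | 1.4484 | 1.4460 |
| 1 | 1000 | 1.5582 | 1.5441 | 1.4355 | 1.4276 |
| 1 | 10,000 | 1.5171 | 1.5108 | 1.4104 | 1.4044 |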

Table 3.

Estimation performance criteria and ISE values for a positive step in the base flow rate from 0.26 to 1.412 mL/min.

ISE values increased as the forgetting factor λ increased. For a given forgetting factor, ISE decreased as the initial value of the covariance matrix increased, and the lowest ISE was obtained with an initial value of 10,000. Increasing the order of polynomial A decreased ISE, whereas increasing the order of polynomial B produced no significant change.
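The selection of the best and worst pH models can be reproduced directly from the Table 3 values. The sketch below simply stores the reported ISE values in an array indexed by forgetting factor, initial covariance value, and model order, then locates the minimum and maximum; the variable names are illustrative only.

```python
import numpy as np

lambdas = [0.96, 0.97, 0.98, 0.99, 1.0]
p0_values = [1, 100, 1000, 10000]
orders = [(2, 1), (2, 2), (3, 1), (3, 2)]      # (na, nb)

# ISE values from Table 3: ise[lambda index][P index][order index].
ise = np.array([
    [[1.2954, 1.2948, 1.1136, 1.2699],
     [1.2394, 1.2278, 1.0715, 1.0658],
     [1.2096, 1.2029, 1.0521, 1.0479],
     [1.1952, 1.1943, 1.0410, 1.0383]],
    [[1.3644, 1.3631, 1.1858, 1.1851],
     [1.3160, 1.3062, 1.1474, 1.1423],
     [1.2815, 1.2731, 1.1260, 1.1204],
     [1.2620, 1.2595, 1.1117, 1.1091]],
    [[1.4439, 1.4428, 1.2706, 1.2699],
     [1.4020, 1.3940, 1.2351, 1.2312],
     [1.3670, 1.3558, 1.2149, 1.2077],
     [1.3405, 1.3373, 1.1961, 1.1929]],
    [[1.5302, 1.5292, 1.3676, 1.3669],
     [1.4910, 1.4852, 1.3326, 1.3298],
     [1.4602, 1.4468, 1.3159, 1.3079],
     [1.4257, 1.4213, 1.2930, 1.2887]],
    [[1.6207, 1.6202, 1.4828, 1.4820],
     [1.5827, 1.5785, 1.4484, 1.4460],
     [1.5582, 1.5441, 1.4355, 1.4276],
     [1.5171, 1.5108, 1.4104, 1.4044]],
])

def describe(index):
    """Map an array index back to (lambda, P, (na, nb))."""
    i, j, k = index
    return lambdas[i], p0_values[j], orders[k]

best = describe(np.unravel_index(np.argmin(ise), ise.shape))
worst = describe(np.unravel_index(np.argmax(ise), ise.shape))
print("lowest ISE :", best, ise.min())
print("highest ISE:", worst, ise.max())

# ISE falls with every increase of the initial covariance value.
print("monotonic in P:", bool(np.all(np.diff(ise, axis=1) < 0)))
```

The minimum and maximum found this way correspond to the most suitable and least successful pH models named in the text.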

$$y_t - 0.2235\,y_{t-1} + 0.1077\,y_{t-2} - 0.0737\,y_{t-3} = 1.2723\,u_{t-1} + 1.921\,u_{t-2} + e_{t-1} \tag{68}$$

As a result, the most suitable ARMAX model was obtained with the order na = 3, nb = 2, a forgetting factor of 0.96, and an initial covariance matrix value of 10,000. The least successful ARMAX model had the order na = 2, nb = 1, a forgetting factor of 1, and an initial covariance matrix value of 1. On the basis of this screening, the ARMAX model given in Eq. (68) was selected for developing the GMV control algorithm for pH control by manipulating the base flow rate.

The RLS estimation of pH obtained with this model (na = 3, nb = 2, λ = 0.96, P = 10,000) is given in Figure 5.

Figure 5.

Experimental system identification estimation of pH (na = 3 nb = 2, λ = 0.96, P = 10,000).

6.3. ST-GMV control applications of baker’s yeast production

The system identification results above provided suitable and unsuitable ARMAX models of the yeast production process that relate the controlled variables, DO and pH, to the manipulated variables, air flow rate and base flow rate. In this step, ST-GMV control performance was evaluated with the suitable and unsuitable ARMAX models determined for each controlled variable, for a constant set point trajectory and various noise levels. ISE was selected as the control performance criterion and was evaluated for the ST-GMV control simulations of both DO and pH. In this way, the control simulations demonstrate how strongly the system identification step, including the choice of model structure and model parameter settings, affects the success of process control.

6.3.1. DO control

In the baker’s yeast production process, with DO as the controlled variable and the air flow rate as the manipulated variable, the most suitable and the unsuitable ARMAX models obtained from system identification were used in the ST-GMV control algorithm. For the positive step input from 0.5 to 3.4 L/min in the air flow rate, the most suitable model had na = 3, nb = 2, λ = 0.97, and P = 1000, whereas the model that identified the system worst (largest ISE value) had na = 2, nb = 1, λ = 1, and P = 1. Both models were used in the ST-GMV control algorithm with the controller parameters P = 1, Q = 0.9975, and R = 2.0885 in the presence of two different noise levels. The suitable model, with ISE values of 50.14 and 52.37, identified the system as expected and provided good control (Figure 6), in contrast to the unsuitable model, with ISE values of 502.01 and 609.56, respectively (Figure 7).

Figure 6.

ST-GMV control simulation results with the most appropriate ARMAX model at the desired constant DO set point with a noise level of (a) 0.005 and (b) 0.05.

Figure 7.

ST-GMV control simulation results with the unsuitable ARMAX model at the desired DO set point with a noise level of (a) 0.005 and (b) 0.05.
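To illustrate how a GMV controller acts on the identified DO model, the sketch below simulates a simplified, fixed-parameter version of the loop. Several things here are assumptions rather than the chapter's exact setup: the model coefficients are kept fixed (no self-tuning), the noise polynomial is taken as 1 with a one-sample delay, the control law is obtained by setting the predicted generalized output y(t+1) − R·w + Q·u(t) to zero with P = 1, the signs of Eq. (67) are as assumed here, the set point is a hypothetical normalized value of 1.0, and the reported noise values are interpreted as standard deviations.

```python
import numpy as np

# Eq. (67) with assumed signs: A = 1 + a1 z^-1 + a2 z^-2 + a3 z^-3, B = b0 + b1 z^-1
A = np.array([1.0, -0.1928, 0.0338, -0.0168])
B = np.array([0.1853, 0.5794])
Q, R = 0.9975, 2.0885          # controller parameters reported for DO (P = 1)

def simulate_gmv(setpoint, n_steps, noise_std, seed=0):
    """Closed-loop simulation of a fixed-parameter GMV law on the DO model."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n_steps + 1)
    u = np.zeros(n_steps)
    for t in range(n_steps):
        # Contribution of past outputs to y(t+1): -a1*y(t) - a2*y(t-1) - a3*y(t-2)
        s = sum(-A[i] * y[t + 1 - i] for i in range(1, 4) if t + 1 - i >= 0)
        u_prev = u[t - 1] if t >= 1 else 0.0
        # GMV law: choose u(t) so that predicted y(t+1) - R*w + Q*u(t) = 0
        u[t] = (R * setpoint - s - B[1] * u_prev) / (B[0] + Q)
        # Plant update with additive white noise
        y[t + 1] = s + B[0] * u[t] + B[1] * u_prev + noise_std * rng.standard_normal()
    ise = float(np.sum((setpoint - y[1:]) ** 2))
    return y, u, ise

for noise_std in (0.005, 0.05):
    y, u, ise = simulate_gmv(setpoint=1.0, n_steps=200, noise_std=noise_std)
    print(f"noise={noise_std}: final output={y[-1]:.4f}, ISE={ise:.4f}")
```

In this simplified loop the noise-free steady-state output equals R·B(1)/(B(1) + Q·A(1)) times the set point, which is about 1.006 with the values used here, so the reported R and Q give almost offset-free tracking under the stated assumptions. The ISE values printed by this sketch are not comparable with those in the chapter, since the set point, run length, and sampling used by the authors are not reproduced.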

6.3.2. pH control

In the baker’s yeast production process, with pH as the controlled variable and the base flow rate as the manipulated variable, the most suitable and the unsuitable ARMAX models obtained from system identification were used in the ST-GMV control algorithm. For the positive step input from 0.26 to 1.41 mL/min in the base flow rate, the most suitable model had na = 3, nb = 2, λ = 0.96, and P = 10,000, whereas the model that identified the system worst (largest ISE value) had na = 2, nb = 1, λ = 1, and P = 1. Both models were used in the ST-GMV control algorithm with the controller parameters P = 1, Q = 0.9375, and R = 1.1885 in the presence of two different noise levels. The suitable model, with ISE values of 110.2 and 113.51, identified the system as expected and provided good control (Figure 8), in contrast to the unsuitable model, with ISE values of 245.69 and 213.69, respectively (Figure 9). The ST-GMV control simulation results are summarized in Table 4.

Figure 8.

ST-GMV control simulation results with the most appropriate ARMAX model at the desired constant pH set point with a noise level of (a) 0.005 and (b) 0.05.

Figure 9.

ST-GMV control simulation results with the unsuitable ARMAX model at the desired pH set point with a noise level of (a) 0.005 and (b) 0.05.

| Controlled variable | Noise | ISE, control with suitable model | ISE, control with unsuitable model |
|---|---|---|---|
| DO | e = 0.005 | 50.1361 | 502.0056 |
| DO | e = 0.05 | 52.3745 | 609.5629 |
| pH | e = 0.005 | 110.2026 | 245.6958 |
| pH | e = 0.05 | 113.5143 | 213.6950 |

Table 4.

Theoretical ST-GMV control performance summary with suitable and unsuitable ARMAX models.
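The penalty for using an unsuitable model can be read from Table 4 as the ratio of the two ISE values in each row. The short sketch below computes these ratios from the tabulated numbers; the dictionary layout and names are illustrative only.

```python
# ISE values from Table 4: (suitable model, unsuitable model)
ise_table4 = {
    ("DO", 0.005): (50.1361, 502.0056),
    ("DO", 0.05): (52.3745, 609.5629),
    ("pH", 0.005): (110.2026, 245.6958),
    ("pH", 0.05): (113.5143, 213.6950),
}

# Ratio of unsuitable-model ISE to suitable-model ISE for each case.
ratios = {case: bad / good for case, (good, bad) in ise_table4.items()}

for (variable, noise), ratio in ratios.items():
    print(f"{variable}, e = {noise}: ISE ratio = {ratio:.2f}")
```

For DO the unsuitable model gives an ISE roughly ten to twelve times larger than the suitable model, while for pH the ratio is close to two, so the DO loop is the more sensitive of the two to model quality in these simulations.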

7. Conclusion

Understanding the dynamic behavior of biotechnological processes in which living cells are used as biocatalysts is one of the most challenging current problems, because thousands of biochemical reactions take place simultaneously. Operation in batch mode is particularly difficult because the process parameters vary with time. In this case, parameter estimation becomes essential for expressing the real process behavior with mathematical models. Various methods and approaches exist for this purpose, and selecting the most appropriate system identification method is the next critical step, since it affects how well the dynamic behavior of the process is estimated. For baker’s yeast production with S. cerevisiae in aerobic batch operation, system identification was carried out readily with the RLS algorithm and successfully described the variations of DO and pH in response to manipulations of the air flow rate and the base flow rate, respectively. The prediction error, expressed as the ISE, shows that the estimation performance was good.

Selection of the model structure is crucial for expressing process behavior accurately. In this study, examination of different model orders gave na = 3, nb = 2 for both ARMAX models. As the order of polynomial A increases, the difference between the actual and predicted values decreases, which is desirable, whereas increasing the order of polynomial B produces no significant change. The selected forgetting factors were 0.97 for DO and 0.96 for pH. The initial value of the covariance matrix was not as influential as the forgetting factor, and a value of 1000 was observed to be appropriate for all experiments.

The theoretical ST-GMV control of DO and pH was performed successfully with the most suitable ARMAX models obtained from system identification. With these models, successful control at a constant set point was still achieved when the noise level in the simulations was increased. In contrast, the performance of a controller that uses an unsuitable model decreases as the noise level increases. In conclusion, successful control can only be accomplished with good system identification.

© 2017 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Zeynep Yilmazer Hitit, Baran Ozyurt and Suna Ertunc (November 8th 2017). The Application of System Identification and Advanced Process Control to Improve Fermentation Process of Baker’s Yeast, Yeast - Industrial Applications, Antonio Morata and Iris Loira, IntechOpen, DOI: 10.5772/intechopen.70696. Available from:
