Empirical equations for estimation of longitudinal dispersion coefficient.

## 1. Introduction

In recent years, the preservation and purification of rivers have received attention from the national and international organizations responsible for quality control and protection of water resources. For public health reasons, this is especially important in regions where rivers supply drinking water to cities and where large industrial plants are located near those rivers (Li et al., 1998; Pourabadei & Kashefipur, 2007; Tayfour & Singh, 2005). Estimation and simulation of flow, contaminant and sediment transport in rivers and water systems are therefore of great significance in water resources management. Precise estimates reduce the present and future risk that contaminants and pollutants pose to the environment and increase the effectiveness of environmental engineering projects aimed at water resources quality (Li et al., 1998).

The increasing pollution of surface waters makes it necessary to exploit the mixing and attenuation processes of natural rivers. One of the most important and successful methods of river environmental management is to use and improve the self-cleaning ability of rivers. Discharging various agricultural and industrial wastes into natural rivers, so that organic materials are oxidized and eliminated, is now a common management practice in environmental engineering. To control the quality of surface water resources, pollutants should be discharged into natural rivers and open channels by a precise and rational method. This requires detailed knowledge of pollutant transport in rivers and of the ability of river flow to transport, mix and self-clean pollutants (Pourabadei & Kashefipur, 2007).

Contaminants and effluents mix with the flow in stages and are spread longitudinally, transversely and vertically by advective and dispersive transport processes (Tayfour & Singh, 2005). The ability of river and other open channel flows to disperse added materials in these three directions is described by dispersion coefficients: K_{X}, K_{Y} and K_{Z} are the dispersion coefficients in the longitudinal, transverse and vertical directions, respectively (Tayfour & Singh, 2005). Far from the point where pollutants are injected into the river, where mixing is complete over the whole cross section, only longitudinal dispersion is dominant and the dispersion phenomena are described by the K_{X} coefficient alone (Chatila, 1997). The rate of longitudinal dispersion is determined by the longitudinal dispersion coefficient, and the fate of a contaminant ultimately depends on longitudinal mixing. Modeling, hazard zoning, monitoring and accurate determination of pollutant conditions in rivers and natural channels therefore require precise estimates of the longitudinal dispersion coefficient (Li et al., 1998; Fisher et al., 1979).

## 2. Importance

Accurate estimation of the longitudinal dispersion coefficient is required in several applied hydraulic problems, such as river engineering, environmental engineering, intake design, estuary problems and risk assessment of the injection of hazardous pollutants and contaminants into river flows (Sedighnezhad et al., 2007; Seo & Bake, 2002). Investigating the water quality of natural rivers with 1-D mathematical models requires the best possible estimates of the longitudinal dispersion coefficient (Fisher et al., 1979). When measurements of the mixing processes in a river are available, the longitudinal dispersion coefficient can be determined directly; in rivers where such mixing and dispersion data are not available, alternative methods must be used to estimate it (Kashefipur & Falconer, 2002). In these cases, because of the complexity of mixing in natural rivers, highly accurate estimates are not possible, and the values are usually determined from simple regression equations (Deong et al., 2001). Several empirical equations for estimating the longitudinal dispersion coefficient in natural rivers are presented in the following sections (Seo & Cheong, 1998). These equations are valid only within the ranges of flow and geometry conditions for which they were calibrated and do not give good results outside those ranges.

The main aim of this chapter is to investigate the methods and equations developed for estimating the dispersion coefficient, to assess their accuracy against measured data and, last but not least, to develop a new and accurate methodology for determining the dispersion coefficient. In the first step, the authors review previous studies; in the second step, a new procedure based on the adaptive neuro-fuzzy inference system (ANFIS) is developed for accurate estimation of longitudinal dispersion coefficients, and its results are compared with those of the previous empirical equations. In what follows, the most important equations for the longitudinal dispersion coefficient are presented first, and the adaptive neuro-fuzzy inference system is then described in detail. The chapter ends with a comparison of the results of the empirical relations and the ANFIS model.

## 3. Materials and methods

This section first presents the theoretical concepts, the research background and the most important equations available for estimating the longitudinal dispersion coefficient. It then describes the adaptive neuro-fuzzy inference system and the algorithm used to develop the model. The data set used in this study and the ranges of its variables are also presented.

### 3.1. Theoretical background

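The one-dimensional (1D) Fickian-type dispersion equation, which was derived by Taylor (Fisher et al., 1979), has been widely used to obtain reasonable estimates of the rate of longitudinal dispersion. The 1D dispersion equation is

$$\frac{\partial C}{\partial t} + u\,\frac{\partial C}{\partial x} = K_{x}\,\frac{\partial^{2} C}{\partial x^{2}} \qquad (1)$$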

where C is the cross-sectional average concentration, u the cross-sectional average longitudinal velocity, t the time, x the longitudinal direction along the stream and K_{x} the longitudinal dispersion coefficient. According to this equation, the fate of pollutant transport in rivers is determined by the value of K_{x}. Fisher (Fisher et al., 1979) developed the following triple integral for its estimation (Tavakollizadeh & Kashefipur, 2007):

$$K_{x} = -\frac{1}{A}\int_{0}^{B} h\,u' \int_{0}^{y} \frac{1}{\varepsilon_{t}\,h} \int_{0}^{y} h\,u'\;dy\,dy\,dy \qquad (2)$$

where K_{x} is the longitudinal dispersion coefficient, A the cross-sectional area of flow, B the top width of the water surface, h the local flow depth at any transverse point, u' the deviation of the depth-averaged velocity from the cross-sectional average velocity, y the transverse distance from the left bank and ε_{t} the transverse mixing coefficient. The unknown term ε_{t} is described by several researchers as a transverse turbulent mixing coefficient. Equation 2 is the basis of several empirical equations proposed for K_{X}. Fisher et al. (Fisher et al., 1979) used the following equation to estimate ε_{t} in wide, straight rivers with uniform flow and constant depth across the section, which have no transverse dispersion

where H is the cross-sectional average flow depth, u_{*} is the shear velocity, equal to √(g H S_{f}), and S_{f} is the longitudinal slope of the energy line.
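As an illustration of how the triple integral for K_{x} can be evaluated when h(y), u'(y) and ε_{t}(y) are known, the following Python sketch integrates K_{x} = -(1/A) ∫ h u' ∫ 1/(ε_{t} h) ∫ h u' dy dy dy with the trapezoidal rule on a uniform transverse grid. The function names, the grid size and the rectangular test channel (constant depth, linear velocity deviation, constant ε_{t}) are illustrative assumptions and are not taken from the cited studies; `shear_velocity` simply implements u_{*} = √(g H S_{f}).

```python
import math

def shear_velocity(depth, energy_slope, g=9.81):
    """Shear velocity u* = sqrt(g * H * S_f)."""
    return math.sqrt(g * depth * energy_slope)

def cumulative_trapezoid(values, dy):
    """Running trapezoidal integral of equally spaced samples, starting at 0."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (left + right) * dy)
    return out

def fischer_kx(width, h, u_dev, eps_t, n=2001):
    """Numerically evaluate the triple integral for K_x.

    h, u_dev and eps_t are functions of the transverse coordinate y that
    return the local depth, the velocity deviation u' and the transverse
    mixing coefficient. The depth must be non-zero at every grid point.
    """
    dy = width / (n - 1)
    ys = [i * dy for i in range(n)]
    hu = [h(y) * u_dev(y) for y in ys]                    # h * u'
    inner = cumulative_trapezoid(hu, dy)                  # int_0^y h u' dy
    mid_integrand = [q / (eps_t(y) * h(y)) for q, y in zip(inner, ys)]
    middle = cumulative_trapezoid(mid_integrand, dy)      # second integral
    outer = cumulative_trapezoid([a * m for a, m in zip(hu, middle)], dy)[-1]
    area = cumulative_trapezoid([h(y) for y in ys], dy)[-1]
    return -outer / area

# Hypothetical rectangular channel: B = 30 m, H = 2 m,
# u'(y) = a * (2y/B - 1) with a = 0.2 m/s, eps_t = 0.01 m^2/s.
B, H, a, eps = 30.0, 2.0, 0.2, 0.01
kx_example = fischer_kx(B, lambda y: H, lambda y: a * (2.0 * y / B - 1.0), lambda y: eps)
```

For this rectangular test case the integral has the closed form a²B²/(30 ε_{t}), which gives 120 m²/s for the values above, so the numerical result can be checked against it. Real cross sections have zero depth at the banks, where the 1/(ε_{t} h) term needs special treatment that this sketch does not attempt.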

Comparison of measurements with the results of equation 2 shows that the average error of this equation is about 30% in uniform flows, while in non-uniform flows its results can reach four times the measured values (FaghforMaghrebi & Givehchi, 2007). Equation 2 is difficult to use in practical cases, because the cross-section geometry h(y) and the transverse velocity profile v(h) are usually not available and cannot be determined easily. Because of this impracticality, Fisher et al. (Fisher et al., 1979) proposed another equation based on several simple non-dimensional parameters (Tayfour & Singh, 2005)

In this relation, B_{1} is the length scale associated with the shear produced by the transverse velocity distribution, ε_{t} is the cross-sectional average of the transverse mixing coefficient and I is a non-dimensional integral

where the non-dimensional parameters are:

In this equation, √(u'²) measures how far the local turbulent-averaged velocity deviates from the cross-sectional average velocity (Tayfour & Singh, 2005). Based on the method proposed by Fisher and on equation 4, researchers have developed several empirical relations, the most important of which are presented in table 1. All of these equations determine the longitudinal dispersion coefficient from variables that relate the average conditions of the river flow to the longitudinal dispersion process: the cross-sectional average flow depth, the average velocity, the shear velocity and the water surface width. In this study, the equations of table 1 are compared and their accuracy is determined using measured data collected from published data sets. The same variables are the input and output parameters of the ANFIS model.

| Author (year) | Equation | Eq. No. | Ref. |
|---|---|---|---|
| Elder (1959) | | (7) | (Tayfour & Singh, 2005) |
| Quien and Quifer (1979) | | (8) | (Deong et al., 2001) |
| Fisher (1976) | | (9) | (Fisher et al., 1979) |
| Liu and Chen (1980) | | (10) | (Seo & Bake, 2002) |
| Liu (1980) | | (11) | (Seo & Bake, 2002) |
| Awasa and (1991) | | (12) | (Tavakollizadeh & Kashefipur, 2007) |
| Seo and Chang (1998) | | (13) | (Seo & Cheong, 1998) |
| Kasiez and Rodriguez (1998) | | (14) | (Sedighnezhad et al., 2007) |
| Huang and Li (1999) | | (15) | (FaghforMaghrebi & Givehchi, 2007) |
| Deong et al. (2001) | | (16) | (Deong et al., 2001) |
| Kashefipur and Falconer (2002) | | (17) | (Kashefipur & Falconer, 2002) |
| Tavakolizadeh (2007) | | (18) | (Tavakollizadeh & Kashefipur, 2007) |
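To show how relations of this kind are applied, the sketch below codes two forms that are commonly quoted in the wider literature for Elder and Fischer. Both formulas and their coefficients (5.93 and 0.011) are assumptions brought in for illustration; they are not read from table 1, and the function names are hypothetical. Each takes only bulk flow variables: width B, depth H, mean velocity U and shear velocity u_{*}.

```python
def elder_kx(depth, shear_vel):
    """Elder-type relation, K_x = 5.93 * H * u* (assumed literature form)."""
    return 5.93 * depth * shear_vel

def fischer_kx_empirical(velocity, width, depth, shear_vel):
    """Fischer-type relation, K_x = 0.011 * U^2 * B^2 / (H * u*) (assumed literature form)."""
    return 0.011 * velocity ** 2 * width ** 2 / (depth * shear_vel)

# Hypothetical reach: U = 1 m/s, B = 10 m, H = 1 m, u* = 0.1 m/s
kx_elder = elder_kx(1.0, 0.1)
kx_fischer = fischer_kx_empirical(1.0, 10.0, 1.0, 0.1)
```

The two forms differ in which variables they use: the first ignores width and mean velocity, while the second depends strongly on both, which is one reason such relations can give very different values for the same reach.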

Tayfour and Singh (Tayfour & Singh, 2005) used artificial neural networks (ANN) to determine the longitudinal dispersion coefficient in natural rivers. Comparison of the ANN results with measured data shows that the ANN model is superior to the empirical relations of Fisher (Fisher et al., 1979), Kashefipur and Falconer (Kashefipur & Falconer, 2002) and Deong et al. (Deong et al., 2001). In the training stage, the correlation coefficient between measured and predicted values was 0.7, with a root mean square error of 193 (Tayfour & Singh, 2005). Although several studies in environmental engineering have used artificial intelligence (ASCE, 2000; Choi & Park, 2001; Chang & Chang, 2006; Maier & Dandy, 1996; Dezfoli, 2003; Rajurkar, 2004; Sadatpour et al., 2005; Karamouz et al., 2004; Lu et al., 2003), only Tayfour and Singh (Tayfour & Singh, 2005) used an artificial neural network to estimate the longitudinal dispersion coefficient in natural rivers. In this study, therefore, a new methodology for estimating the longitudinal dispersion coefficient in rivers is developed, and its results are compared with those of the previous empirical relations.

### 3.2. Fuzzy logic and fuzzy systems

Fuzzy systems and fuzzy logic hold a special place among modern modeling methods (Zadeh, 1965). Their main characteristics are the ability to represent human knowledge through linguistic labels and fuzzy rules, their nonlinearity and their adaptability (Jang, 1993). A fuzzy system is a logical system based on if-then fuzzy rules, and the starting point for building a new fuzzy system is to derive a set of if-then fuzzy rules from the knowledge of an expert or from knowledge of the field being modeled (Dezfoli, 2003). A method or tool that extracts fuzzy rules from numerical, statistical or linguistic information therefore provides a suitable and simple route to modeling with fuzzy expert systems (Nayak et al., 2004).

Another modern modeling method is the artificial neural network, whose most important feature is the ability to learn from training sets (matched input and output pairs). Neural networks use several training algorithms to extract the relations between input and output parameters (Tashnehlab et al., 2001). Combining fuzzy systems, which work with logical rules, and artificial neural networks, which extract knowledge from numerical information, makes it possible to develop models that use numerical information and linguistic statements simultaneously. This combination of artificial neural networks and fuzzy systems is called the adaptive neuro-fuzzy inference system (Jang, 1995; Kisi et al., 2001; Gopakumar & Mujumdar, 2007; Sen & Altunkaynak, 2006).

A fuzzy system is based on logical if-then rules. It maps the space of input variables to the space of output variables using linguistic statements and a fuzzy decision-making procedure (Jang, 1995; Dezfoli, 2003). The fuzzy rule set, which describes the relations between the fuzzy variables, is the most important component of a fuzzy system (Karamouz et al., 2004). Because real field data are uncertain, a fuzzification step is used to transform deterministic values into fuzzy values, and a defuzzification step is used to transform fuzzy values back into deterministic values (Maier & Dandy, 1996; Dezfoli, 2003). The most common type of fuzzy system is the Sugeno fuzzy system, in which the fuzzy rules are stored in a rule base. The rules in this system have the form

$$R_{i}:\ \text{If } x_{1} \text{ is } A_{1} \text{ and } x_{2} \text{ is } A_{2} \ \dots \text{ and } x_{n} \text{ is } A_{n} \text{ then } y = f(x_{1}, x_{2}, \dots, x_{n})$$

where A_{i} are the fuzzy sets. In this system, the if part of a rule is a fuzzy value, and the result part is a real function of the input values, usually a linear expression such as a_{1}x_{1} + a_{2}x_{2} + … + a_{n}x_{n} (Dezfoli, 2003).

### 3.3. Adaptive neuro-fuzzy inference system (ANFIS)

ANFIS, the abbreviation of adaptive neuro-fuzzy inference system, is an adaptive fuzzy system that relies on the learning ability of artificial neural networks (Jang, 1995). It is a Sugeno fuzzy system with a feed-forward network structure. Figure 1 shows a Sugeno fuzzy system with two inputs, one output and two rules and, below it, the equivalent ANFIS system (Tashnehlab et al., 2001). The system has two inputs x and y and one output, and its rules are

$$\text{Rule } i:\ \text{If } x \text{ is } A_{i} \text{ and } y \text{ is } B_{i} \text{ then } f_{i} = p_{i}x + q_{i}y + r_{i}, \qquad i = 1, 2$$

If the output of node i in layer j is denoted by O_{j,i}, the ANFIS structure has five layers (Jang, 1995). Based on figure 1, these layers operate as follows:

*First layer, input nodes:* every node in this layer is a fuzzy set, and the output of each node is the membership degree of the input variable in that fuzzy set. The shape parameters of the node determine the shape of the membership function of the fuzzy set (Zadeh, 1965). Membership functions are usually bell-shaped functions such as (Jang, 1993)

$$O_{1,i} = \mu_{A_{i}}(x) = \frac{1}{1 + \left|\dfrac{x - c_{i}}{a_{i}}\right|^{2b_{i}}}$$

where x is the input value to node i, and a_{i}, b_{i} and c_{i} are the parameters of the membership function of this set. These parameters are usually called the if (premise) parameters.

*Second layer, rule nodes:* every node in this layer computes the degree of activation (firing strength) of one rule

$$O_{2,i} = w_{i} = \mu_{A_{i}}(x)\,\mu_{B_{i}}(y), \qquad i = 1, 2$$

where μ_{Ai}(x) is the membership degree of x in the set A_{i} and μ_{Bi}(y) is the membership degree of y in the set B_{i}.

*Third layer, medium nodes:* node i in this layer computes the ratio of the activation degree of rule i to the sum of the activation degrees of all rules

$$O_{3,i} = \bar{w}_{i} = \frac{w_{i}}{w_{1} + w_{2}}, \qquad i = 1, 2$$

where \bar{w}_{i} is the normalized activation degree of rule i.

*Fourth layer, consequent nodes:* the output of each node in this layer is calculated as

$$O_{4,i} = \bar{w}_{i} f_{i} = \bar{w}_{i}\left(p_{i}x + q_{i}y + r_{i}\right)$$

In this equation p_{i}, q_{i} and r_{i} are the adaptive parameters of the layer and are called the consequent parameters.

*Fifth layer, output nodes:* every node in this layer computes a final output value (the number of nodes equals the number of output parameters)

$$O_{5} = \sum_{i} \bar{w}_{i} f_{i} = \frac{\sum_{i} w_{i} f_{i}}{\sum_{i} w_{i}}$$
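The five layers can be sketched as a single forward pass. The Python example below assumes a two-input Sugeno-type ANFIS with the generalized bell membership function 1 / (1 + |(x - c)/a|^{2b}) and first-order consequents p_{i}x + q_{i}y + r_{i}; the function names, the data layout and the numerical values are hypothetical and chosen only to keep the example small.

```python
def bell(x, a, b, c):
    """Generalized bell membership function 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input Sugeno-type ANFIS.

    premise:    one pair ((a, b, c) for A_i, (a, b, c) for B_i) per rule
    consequent: one tuple (p_i, q_i, r_i) per rule
    Returns the overall output and the normalized firing strengths.
    """
    # Layers 1 and 2: membership degrees and firing strength of each rule
    w = [bell(x, *mf_a) * bell(y, *mf_b) for mf_a, mf_b in premise]
    # Layer 3: normalized firing strengths
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted rule outputs w_bar_i * (p_i x + q_i y + r_i)
    rule_out = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output is the sum of the weighted rule outputs
    return sum(rule_out), w_bar

# Hypothetical two-rule system with made-up parameters
premise = [((2.0, 2, 0.0), (2.0, 2, 0.0)), ((2.0, 2, 5.0), (2.0, 2, 5.0))]
consequent = [(1.0, 0.5, 0.0), (-0.5, 1.0, 2.0)]
output, strengths = anfis_forward(1.0, 4.0, premise, consequent)
```

Because the normalized strengths always sum to one, the output is a weighted average of the linear rule outputs, with weights that change smoothly as the inputs move between fuzzy sets.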

In this way, a fuzzy system with the ability to learn can be developed. The main learning algorithm is error back-propagation: using gradient descent, the error is propagated back towards the input layers and nodes, and the model parameters are adapted. Based on figure 1, the total output of the system can be written as a linear function of the consequent parameters (Zadeh, 1965)

$$f = \bar{w}_{1} f_{1} + \bar{w}_{2} f_{2} = (\bar{w}_{1}x)\,p_{1} + (\bar{w}_{1}y)\,q_{1} + \bar{w}_{1}\,r_{1} + (\bar{w}_{2}x)\,p_{2} + (\bar{w}_{2}y)\,q_{2} + \bar{w}_{2}\,r_{2}$$

The consequent parameters can therefore be determined by the least squares method. Combining this method with error back-propagation gives a hybrid learning method, which operates as follows. In each training epoch, in the forward pass, the node outputs are calculated up to the fourth layer and the consequent parameters are then computed by least squares. In the next step, after the error has been calculated, the error is propagated backwards to the if (premise) parameters, and these are adapted by gradient descent (Zadeh, 1965; Sadatpour et al., 2005; Nayak et al., 2004; Gopakumar & Mujumdar, 2007). In this study, the input parameters of the developed model are flow width, flow depth, cross-sectional average velocity and shear velocity, and the output parameter is the longitudinal dispersion coefficient of the pollutant.
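The least squares half of the hybrid method can be sketched as follows: with the if (premise) parameters held fixed, the output is linear in the consequent parameters, so they follow from an ordinary least squares fit. This Python example assumes two inputs, bell-shaped membership functions and first-order consequents, and it solves the normal equations by Gaussian elimination to stay dependency-free; all names and the data layout are hypothetical, and the gradient-descent update of the premise parameters is not shown.

```python
def bell(x, a, b, c):
    """Generalized bell membership function."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def normalized_strengths(x, y, premise):
    """Normalized firing strengths of all rules for one input pair."""
    w = [bell(x, *mf_a) * bell(y, *mf_b) for mf_a, mf_b in premise]
    total = sum(w)
    return [wi / total for wi in w]

def solve_linear(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * sol[k] for k in range(i + 1, n))
        sol[i] = (aug[i][n] - s) / aug[i][i]
    return sol

def fit_consequents(samples, premise):
    """Least squares estimate of (p_i, q_i, r_i) with premise parameters fixed.

    samples: list of (x, y, target). Returns one (p, q, r) tuple per rule.
    """
    rows = []
    for x, y, _ in samples:
        w_bar = normalized_strengths(x, y, premise)
        # One row of the design matrix: [w1*x, w1*y, w1, w2*x, w2*y, w2, ...]
        rows.append([v for wb in w_bar for v in (wb * x, wb * y, wb)])
    m = len(rows[0])
    # Normal equations (A^T A) theta = A^T t
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    atb = [sum(r[i] * t for r, (_, _, t) in zip(rows, samples)) for i in range(m)]
    theta = solve_linear(ata, atb)
    return [tuple(theta[k:k + 3]) for k in range(0, m, 3)]
```

Solving the normal equations directly is adequate for a small illustration; with many rules and few samples the system becomes ill-conditioned or singular, and a more robust least squares solver would be needed.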

## 4. The database

Estimating the longitudinal dispersion coefficient in rivers with the equations of table 1 or with ANFIS models requires hydraulic and geometric data. In this study, a wide range of data published in the literature was reviewed and a data set was compiled, with which the results of the empirical equations and of ANFIS are compared and assessed. Only data series that contain all the parameters required by the empirical equations were collected. Table 2 shows the range of variation of the collected data and its parameters. The data set was collected from several references, including (Li et al., 1998; Pourabadei & Kashefipur, 2007; Tayfour & Singh, 2005; Choi & Park, 2001; Chatila, 1997).

| Parameter | Range | Average |
|---|---|---|
| Flow velocity (m/s) | 0.034-2.23 | 0.7116 |
| Flow depth (m) | 0.22-25.1 | 3.69 |
| Flow width (m) | 11.89-201 | 137.74 |
| Shear velocity (m/s) | 0.0024-553 | 0.0956 |
| K_{x} (m²/s) | 1.9-2883.5 | 223.1 |

Of the collected data set (73 series), 70% was used to train the ANFIS model and the remaining 30% to test it. The training and testing sets were selected randomly, and the optimum structure of the ANFIS model was determined by starting from the default settings of the commercial MATLAB software and proceeding by trial and error over several models with different structures. The final optimum structure used grid partitioning to generate the fuzzy rules, Gaussian membership functions, 4 input parameters with 3 membership functions for each input, and 30 training epochs. Detailed descriptions of developing ANFIS models with MATLAB are given in several published papers (Riahi et al., 2007; Riahi & Ayyoubzadeh, 2007a; Riahi & Ayyoubzadeh, 2007b; Dezfoli, 2003; Sadatpour et al., 2005; Karamouz et al., 2004; Kisi et al., 2001; Nayak et al., 2004; Gopakumar & Mujumdar, 2007).
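The data split and the size of the rule base implied by this structure can be sketched in Python. The random seed, the rounding rule for the 70% share and the assumption of first-order Sugeno consequents (one coefficient per input plus a constant for each rule) are illustrative choices that the text does not specify; the function names are hypothetical.

```python
import random

def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Randomly split sample indices into training and testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]

def grid_partition_size(n_inputs, n_mfs, mf_params=2):
    """Rule and parameter counts for a grid-partitioned Sugeno system.

    mf_params is the number of parameters per membership function
    (2 for a Gaussian: centre and width).
    """
    n_rules = n_mfs ** n_inputs                    # one rule per combination of sets
    premise = n_inputs * n_mfs * mf_params         # membership function parameters
    consequent = n_rules * (n_inputs + 1)          # linear coefficients plus constant
    return n_rules, premise, consequent

train_idx, test_idx = split_indices(73)
rules, premise_params, consequent_params = grid_partition_size(4, 3)
```

Under these assumptions, 73 series split into 51 training and 22 testing series, and 4 inputs with 3 membership functions each give 3⁴ = 81 rules, 24 membership function parameters and 405 consequent parameters.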

## 5. Results and discussions

This section presents the statistical parameters used for accuracy assessment, followed by the final results of the empirical relations and the ANFIS model. The statistical parameters are described first.

### 5.1. Statistical parameters

The results of the empirical relations and the ANFIS model were assessed using statistical parameters such as the correlation coefficient (R^{2}), mean absolute error (MAE), root mean square error (RMSE) and mean square error (MSE). These parameters describe the average behavior of the model error; they are global statistics that give no information about how the error is distributed over the results. For this reason, two further statistical parameters are used that assess model performance more precisely. These parameters, which not only summarize the predictive performance of a model in a single index but also show the distribution of errors over all results, are the Average Absolute Relative Error (AARE) and the Threshold Statistics index (TS) (Maier & Dandy, 1996; FaghforMaghrebi & Givehchi, 2007). The TS_{x} index shows the distribution of error in the predicted values of a model and is determined for different levels x% of absolute relative error. The value of TS_{x} is given by

$$TS_{x} = \frac{Y_{x}}{n} \times 100$$

where Y_{x} is the number of predicted values (out of a total of n) whose absolute relative error is less than x%. The mathematical expressions of these statistical parameters are given in (Maier & Dandy, 1996; FaghforMaghrebi & Givehchi, 2007; Kisi et al., 2001; Gopakumar & Mujumdar, 2007).
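A minimal Python sketch of these measures is given below, assuming the usual definitions: MSE as the mean squared difference, RMSE as its square root, MAE as the mean absolute difference, AARE as the mean of |observed - predicted| / |observed| in percent, and TS_{x} as the percentage of predictions with absolute relative error below x%. The function names and the sample values are hypothetical.

```python
import math

def mse(observed, predicted):
    """Mean square error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(mse(observed, predicted))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def relative_errors(observed, predicted):
    """Absolute relative error of each prediction, in percent."""
    return [abs(o - p) / abs(o) * 100.0 for o, p in zip(observed, predicted)]

def aare(observed, predicted):
    """Average absolute relative error, in percent."""
    errs = relative_errors(observed, predicted)
    return sum(errs) / len(errs)

def threshold_statistic(observed, predicted, x_percent):
    """TS_x: percentage of predictions with absolute relative error below x%."""
    errs = relative_errors(observed, predicted)
    y_x = sum(1 for e in errs if e < x_percent)
    return 100.0 * y_x / len(errs)

# Hypothetical measured and predicted K_x values (m^2/s)
measured = [10.0, 20.0, 40.0, 100.0]
estimated = [12.0, 15.0, 40.0, 250.0]
ts_50 = threshold_statistic(measured, estimated, 50.0)
```

In this small example a single large miss dominates MSE and RMSE, while TS_{50} still reports that three of the four predictions are within 50% of the measured value, which is the kind of distributional information the global statistics hide.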

### 5.2. Results of empirical equations

The empirical equations of table 1 were evaluated on the whole collected data set and their results were compared with the measured data. Table 3 shows the results. None of these empirical equations performs well; all show considerable errors in comparison with the measured data. The best empirical equation is that of Huang and Li (Li et al., 1998), with R^{2}=0.48, RMSE=295.7 m²/s, MSE=87439.6 m⁴/s², MAE=132.98 m²/s and AARE=68.46%. The values of these statistical indexes show the poor performance of the empirical equations in predicting longitudinal dispersion coefficients.

According to table 3 and the equations of table 1, the poorest performance is that of equation 8 (Li et al., 1998), which relates K_{x} directly to the square of flow depth. Physically, however, K_{x} is a function of the transverse velocity profile, whose effect decreases as flow depth increases. Another finding is that when flow depth or flow width is omitted from an empirical equation, the accuracy of the equation decreases considerably in comparison with similar equations, because one of the most important parameters is missing; equations 7, 10, 12 and 14 are examples. It is also clear that the average flow velocity affects K_{x} more than the flow width does. For example, equation 17, which does not include flow width, is clearly better than equations that do not include flow velocity, such as 10, 12 and 14.

Figure 2 shows the distribution of error in the values predicted by the empirical equations. Equations 7, 8 and 10, which perform poorly, are omitted from the figure. Because the maximum error threshold of some equations exceeded 5000%, the upper bound of the x-axis was set to 500%. The figure shows that for 50% of the predicted values the error is greater than 100%, which is very high. For equation 15 (the best one), all predicted values have errors below 300%, whereas for all the other equations the errors reach 500%.

| Author (year) | R² | RMSE (m²/s) | MSE (m⁴/s²) | MAE (m²/s) | AARE (%) |
|---|---|---|---|---|---|
| Elder (1959) | 0.12 | 452.5 | 204752.5 | 217.7 | 97.18 |
| Quien and Quifer (1979) | 0.01 | 598974.19 | 35870077 | 118320.3 | 51798.3 |
| Fisher (1976) | 0.44 | 1891.7 | 3578526.49 | 833.71 | 331.5 |
| Liu and Chen (1980) | 0.10 | 455.43 | 207419.03 | 218.72 | 93.12 |
| Liu (1980) | 0.35 | 472.95 | 223345.63 | 238.39 | 179.3 |
| Awasa and (1991) | 0.30 | 335.46 | 112535.11 | 148.87 | 191.31 |
| Seo and Chang (1998) | 0.42 | 1022.25 | 1044996.01 | 433.50 | 637.2 |
| Kasiez and Rodriguez (1998) | 0.28 | 481.92 | 23246.29 | 262.61 | 259.87 |
| Huang and Li (1999) | 0.48 | 295.7 | 87439.6 | 132.98 | 68.46 |
| Deong et al. (2001) | 0.38 | 841.83 | 708674.88 | 352.86 | 169.2 |
| Kashefipur and Falconer (2002) | 0.35 | 909.31 | 826843.83 | 330.39 | 496.83 |
| Tavakolizadeh (2007) | 0.44 | 376.66 | 141874.83 | 172.17 | 89.92 |

### 5.3. ANFIS model results

Using the collected data set, a new model for predicting the longitudinal dispersion coefficient in natural rivers was developed with the ANFIS method. The results of this model in the training and testing steps are presented in figures 3 to 6, and its statistical results are presented in table 4. The input parameters of the model are flow width, flow depth, average velocity and shear velocity, and the output parameter is the longitudinal dispersion coefficient. Figures 3 to 6 show that the ANFIS model learned the dispersion process in natural rivers and predicted the K_{X} values accurately. The model extracted the dominant features of pollutant transport in natural rivers and simulated longitudinal dispersion. Comparison of the results of the ANFIS model (table 4) with those of the empirical equations (table 3) shows the superiority of the ANFIS model in predicting K_{X} values in rivers. Figure 7 compares the error distribution of the ANFIS model in the training and testing steps with that of the best empirical equations in table 3 (equations 15 and 18).

| Stage of model development | R² | RMSE | MSE | MAE | AARE (%) |
|---|---|---|---|---|---|
| Training stage | 0.9957 | 15.18 | 230.43 | 8.66 | 63.48 |
| Testing stage | 0.9084 | 187.8 | 35240.14 | 104.77 | 127.68 |

Based on the results in figure 7, in 70% of the predicted cases the error of the ANFIS model in the training step is less than 100%, which is lower than the errors of equations 18 and 15. In the testing step, table 4 and figure 7 also show that the results of the ANFIS model are better than those of the empirical equations. The ANFIS model performs well in predicting K_{X} values in comparison with the empirical equations despite the limited number of data series in the training and testing steps and the wide range of variation of the data set parameters, and it is simple and quick to develop. This shows the high ability of the model to predict K_{X} values, compared with the empirical equations, without any need for the mathematical equations of the phenomenon or for their numerical solution. The results of this study show that the ANFIS model can be used as a precise alternative method for predicting longitudinal dispersion coefficients.

## 6. Conclusions

In this chapter, the authors investigated the methods and equations available for predicting the longitudinal dispersion coefficient in natural rivers and collected a data set to evaluate the performance of these equations. None of the empirical equations performs well; all show considerable errors in comparison with the measured data. The best empirical equation is that of Huang and Li (Li et al., 1998), with R^{2}=0.48, RMSE=295.7 m²/s, MSE=87439.6 m⁴/s², MAE=132.98 m²/s and AARE=68.46%. The values of these statistical indexes show the poor performance of the empirical equations in predicting longitudinal dispersion coefficients. For 50% of the predicted values the error of these equations is greater than 100%, which is very high; for equation 15 (the best one) all predicted values have errors below 300%, whereas for all the other equations the errors reach 500%.

Using the collected data set, a new model for predicting the longitudinal dispersion coefficient in natural rivers was developed with the ANFIS method. The input parameters of the model are flow width, flow depth, average velocity and shear velocity, and the output parameter is the longitudinal dispersion coefficient. The results show that the ANFIS model learned the dispersion process in natural rivers and predicted the K_{X} values accurately. The model extracted the dominant features of pollutant transport in natural rivers and simulated longitudinal dispersion. Comparison of the results of the ANFIS model (table 4) with those of the empirical equations (table 3) shows the superiority of the ANFIS model in predicting K_{X} values in rivers. In 70% of the predicted cases the error of the ANFIS model in the training step is less than 100%, which is lower than the errors of equations 18 and 15. The ANFIS model performs well in comparison with the empirical equations despite the limited number of data series in the training and testing steps and the wide range of variation of the data set parameters, and it is simple and quick to develop. This shows the high ability of the model to predict K_{X} values without any need for the mathematical equations of the phenomenon or for their numerical solution. The methodology presented in this chapter is a new approach to estimating the dispersion coefficient in streams and can be combined with mathematical models of pollutant transport or used for real-time updating of such models.