Peak ozone concentrations (ppm) for the years 2002, 2003, 2004 and 2005 (Semades, 2005)
Advances in mathematical models to describe the formation, emission, transport and disappearance of air pollutants have led to a greater understanding of the dynamics of these pollutants. However, the more complex the model, the more information is required for its application to yield results of technical or scientific value with sufficient certainty (Russell & Dennis, 2000). These deterministic models require much information that is not always possible to obtain; the data available have not always resulted in successful outcomes upon application of the model (Roth, 1999), or the cost of obtaining reliable data can be prohibitive (Pun & Louis, 2000).
There are other methods requiring less information that can be used to study air pollution in some areas. These methods generally make use of statistical techniques such as regression or other numerical data-fitting methods to establish the relationships between the various physicochemical parameters and the variable of interest, based on routinely measured historical data.
The main objectives of these methods include investigating and assessing trends in air quality, making environmental forecasts and increasing scientific understanding of the mechanisms that govern air quality (Thompson et al., 2001).
Among the techniques being examined to relate air quality in a given area to measured physical and chemical parameters, the three that have been used most often are i) multivariate regression (Hubbard & Cobourne, 1998; Comrie & Diem, 1999; Davis & Speakman, 1999; Draxler, 2000; Gardner & Dorling, 2000), ii) artificial neural networks (ANN) (Perez & Reyes, 2006; Brunelli et al., 2006; Thomas & Jacko, 2007; Grivas & Chaloulakou, 2005; Gardner & Dorling, 1999), and iii) time series and spectral analysis (Raga & Moyne, 1996; Chen et al., 1998; Milanchus et al., 1998; Salcedo et al., 1999; Sebald et al., 2000).
Artificial neural networks have greater flexibility, efficiency and accuracy because they share several features with the brain: they are capable of learning from experience, of generalizing from previous cases to new cases, and of abstracting essential features from inputs containing irrelevant information. Adaptive learning, the ability to learn to perform tasks from training or initial experience, is one of their most attractive features. ANN do not need an explicit algorithm to solve a problem because they generate their own distribution of link weights through learning, and they are easily inserted into existing technology. Because of these characteristics, ANN generally have low computational requirements and are less complex to construct.
The pollutant of interest in this study is tropospheric ozone, as it is the main component of the type of air pollution known as smog or photochemical smog. According to the National Ecology Institute (NEI), the Metropolitan Zone of Guadalajara, Mexico (GMA) ranks second in Mexico in exceedances of the NOM-020-SSA1-1993 Mexican air pollution standard. Tropospheric ozone is one of the five major pollutants with harmful effects on human health, causing respiratory problems and ailments such as headaches and eye irritation, as well as affecting vegetation, metals, construction materials, dyes and pigments.
1.1. Tropospheric ozone formation
Photochemical smog is formed through a photochemical process from a combination of gases in the troposphere, such as nitrogen oxides (NOX, i.e., NO and NO2), volatile organic compounds (VOCs) and carbon monoxide (CO), as has been documented (Seinfeld, 1978; Boubel, 1994; Godish, 1991, as cited in Comrie, 1997).
The sequence of events begins in the early hours of the morning, when the start of human activity in large cities (heaters are turned on and traffic density increases) produces a heavy emission of hydrocarbons (HC) and nitrogen monoxide (NO). NO is oxidized to nitrogen dioxide (NO2), increasing the concentration of the latter in the atmosphere. Higher concentrations of NO2, together with increasing solar radiation as the morning wears on, start the photolytic NO2 cycle, generating atomic oxygen which, as it is transformed into ozone, leads to an increase in the concentration of oxygen and hydrocarbon free radicals. These, when combined with significant amounts of NO, cause NO in the atmosphere to decrease.
This impedes completion of the photolytic cycle, rapidly increasing the ozone (O3) concentration (Comrie, 1997).
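For reference, the NO2 photolytic cycle mentioned above is commonly written in textbook form as the three reactions below. This is the standard formulation, given here as an aid to the reader rather than as a reproduction of this chapter's numbered equations; M denotes any third body (typically N2 or O2) that carries away excess energy.

```latex
\begin{align*}
\mathrm{NO_2} + h\nu &\rightarrow \mathrm{NO} + \mathrm{O} \\
\mathrm{O} + \mathrm{O_2} + \mathrm{M} &\rightarrow \mathrm{O_3} + \mathrm{M} \\
\mathrm{O_3} + \mathrm{NO} &\rightarrow \mathrm{NO_2} + \mathrm{O_2}
\end{align*}
```

In the undisturbed cycle the third reaction consumes ozone about as fast as the first two produce it. When NO is instead oxidized to NO2 by other routes, the third reaction is starved of NO and ozone accumulates.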
These relationships can be expressed conceptually; the polluted urban atmosphere contains approximately one hundred different hydrocarbons, olefins being the most reactive. The result of the atomic oxygen attack on the olefin produces two free radicals. In the case of propylene, the first stage of the reaction is the addition of oxygen to the double bond to give a reactive complex (1)
The more likely reaction is (2), since it implies less regrouping of the activated complex than (H). and radicals quickly form formaldehyde and acetaldehyde, respectively. Reactions (2) and (3) are the initial stages of a chain process
The chain reaction enables rapid oxidation of NO to NO2 by alkoxyl and peroxyacyl radicals without the intervention of atomic oxygen and O3, which provides some explanation for the changes observed in the concentration of gaseous pollutants during the day.
When atmospheric concentrations of hydrocarbons increase because of motor vehicle activity, the photolytic cycle of NO2 is disturbed and NO is oxidized to NO2 by the chain reaction involving the hydrocarbon radical (equations 2–8). As a result, the constant low O3 concentration found in the photolytic cycle of NO2 grows, and ozone is not consumed in the oxidation of NO to NO2 (Seinfeld, 1978).
As the morning advances, solar radiation promotes the formation of photochemical oxidants, increasing their concentration in the atmosphere. When concentrations of precursors (NOX and HC) in the atmosphere are lowered, the formation of oxidants stops and their concentrations decrease as the day progresses. Hence, photochemical pollution in cities builds up mainly in the mornings.
Due to industrial development in the GMA in recent years, there has been an urban–green–industrial zone imbalance, leading to the generation of various kinds of pollutants that alter the quality of the environment and exceed the assimilative capacity of the ecosystem.
Given this situation, it is vital to have a mathematical model that correctly predicts ozone concentrations at any given time, as this will help determine preventive measures and/or corrective actions to prevent exposure to high ozone concentrations. These models are able to relate air quality to certain other specific parameters of the air shed, such as emission levels and weather conditions.
2. Data sources
From an analysis of reports from 2002–2005, it was determined that the highest ozone concentrations were in the southern area of the GMA, so specific data for meteorological and chemical variables were obtained from the Miravalle weather station, located in the south. These are shown in Table 1.
2.1. Meteorological and chemical variables
Meteorological data for the period April 1999 to June 2005 were obtained from the Mexican National Weather Service (MNWS). These data consist of averages over time intervals ranging from 0 to 23 hrs.
The meteorological variables are wind direction (average and maximum average) (degrees), wind speed (average and maximum average) (km/h), average temperature (°C), relative humidity (%), barometric pressure (mbar), precipitation (mm) and solar radiation (W/m2).
The data were obtained from the Chapala station, which belongs to the Automatic Monitoring Stations (AMS) system.
Data on the following chemical variables were provided by the National Ecology Institute (NEI) for the Miravalle station: ozone and nitrogen oxides (NOX and NO2), as shown in Figure 1.
3. Selection of meteorological and chemical variables
Meteorological and chemical variables used to carry out ground-level ozone forecasts were selected based on existing knowledge from the scientific literature and an analysis of correlations between different variables, and on availability of data from monitoring stations.
3.1. Analysis of meteorological variables
3.1.1. Wind speed
Atmospheric movements of the air (i.e., winds) are responsible for the spread of high concentrations of pollutants (in this case O3 and its precursors) through the atmosphere, but this may or may not occur quickly. If the winds are calm, i.e., the wind speed is low, and the topography traps the air mass, pollutants cannot disperse; they continue to accumulate and their concentration can reach very high levels. In contrast, if wind speeds are high, the pollutants tend to disperse quickly (Melas et al., 2000).
3.1.2. Temperature
Temperature has shown a strong correlation with the concentration of ozone. The basic reasoning is that photochemical reaction rates are sensitive to temperature, so that increasing the temperature in the troposphere stimulates a series of interlinked reactions that contribute to ozone formation (Garcia, 2003).
3.1.3. Relative humidity
Water vapor is one of the most basic components of the atmosphere. Its amount can be quite variable. It is important because it is one of the atmospheric elements which most absorbs solar radiation, preventing it from interacting with the primary pollutants and forming secondary pollutants such as ozone (Ayllón, 1996).
3.1.4. Precipitation
Precipitation is one of the main ways that pollutants are removed from the atmosphere, but as a result, pollutants removed from the air then contaminate the earth’s surface, which in some cases results in their becoming even more active due to their effects on surface water, plants and materials (acid rain) (Melas et al., 2000).
3.1.5. Barometric pressure
The relationship between temperature and pressure is that the vertical motion of air is determined by the vertical variation in temperature in the troposphere; temperature decreases at a rate of 0.64 °C per 100 m of altitude. Thus, the earth’s surface warms the air parcel next to it, and this hot air expands, becoming less dense than the cooler air above it. The warm air rises and cool air takes its place, is heated in turn by contact with the surface, and subsequently also rises. This creates air currents (vertical mixing) that contribute to the dispersion of pollutants (Rodríguez & Tenorio, 2006).
3.1.6. Solar radiation
This is the factor that has the greatest effect on photochemical reactions, i.e., it is involved in the formation and destruction of the various compounds involved in the increase of tropospheric ozone (Melas et al., 2000).
Photochemical dissociation in the atmosphere can be considered as a two-step process. The absorption of a photon by a molecule raises it to an excited state, and the excited molecule then dissociates into new products that can be highly reactive, generating photochemical smog (Wark & Warner, 2000), as explained in Section 1.1.
3.2. Analysis of chemical variables
Many pollutants are highly persistent, and it is generally accepted that the probability of pollution episodes is increased if the previous day’s pollution levels were higher than normal.
In this study, the previous day’s maximum O3 and NOX are used as chemical input variables (Melas et al., 2000).
3.2.1. Previous day’s ozone
Even when the tropospheric ozone photolytic cycle is considered to be in equilibrium (generation and degradation of ozone in equilibrium with NOx), when hydrocarbons are involved (equations 2–8) the ozone generated is not consumed in the oxidation of NO to NO2 (Seinfeld, 1978). There is ozone remaining from the previous day, which should be input to the structure of the neural network model as an initial ozone concentration on the day of interest.
3.2.2. Oxides of nitrogen
These variables are the main precursors to ozone formation. Oxides that are present in the atmosphere in significant quantities are nitrogen monoxide and nitrogen dioxide (NOx = NO + NO2); approximately 90% of them are destroyed by photolysis in the formation of ozone (Wark & Warner, 2000).
These variables experience higher photolytic breakdown between 11 a.m. and 3 p.m.; i.e., when there is a higher incidence of light, after which time the levels start to gradually rise. Thus the concentrations which remain following the photolytic period participate as raw material for the formation of ozone the next day. Therefore, the maximum concentration between 3 p.m. and 11 p.m. on the previous day is used as the input variable (Seinfeld, 1978).
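The previous-day input described above can be sketched as follows. This is a minimal illustration, assuming hourly averages indexed 0 to 23 and that the 3 p.m. to 11 p.m. window includes both end hours; the function name and the sample values are hypothetical and not taken from the study's database.

```python
def previous_day_evening_max(hourly_by_day, day_index, start_hour=15, end_hour=23):
    """Maximum hourly value between start_hour and end_hour (inclusive)
    on the day before day_index.

    hourly_by_day: list of days, each a list of 24 hourly averages (index = hour).
    Missing readings are represented by None and skipped.
    """
    if day_index < 1:
        raise ValueError("no previous day available")
    window = hourly_by_day[day_index - 1][start_hour:end_hour + 1]
    valid = [v for v in window if v is not None]
    return max(valid) if valid else None


# Hypothetical NOx record (ppm): a midday peak that must be ignored
# and an evening peak that should be picked up.
day0 = [0.02] * 24
day0[12] = 0.09   # noon value, outside the 3 p.m. to 11 p.m. window
day0[17] = 0.07   # 5 p.m. value, inside the window
day1 = [0.01] * 24
nox = [day0, day1]

nox_input = previous_day_evening_max(nox, 1)
```

The same helper would apply unchanged to the NO2 series, since both variables use the same window.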
3.3. Statistical analysis
In addition to the analysis described in 3.1 and 3.2, the selection of variables was based on 1) individual regressions between the variable of interest (maximum ozone) and the various parameters (temperature, humidity, etc.), selecting only those with correlation coefficients (r) of 0.3 or greater (Garcia, 2003); and 2) the t-ratio (t), used to obtain the degree of importance of each of the variables with respect to the dependent variable, namely ozone (see Table 2).
This analysis indicates that the higher the absolute value of t, the more necessary the variable (Miller et al., 1992) in the ANN model. All values of meteorological variables correspond to the day in question, so if models are to be used in real time, the values for that day are required. The values of the chemical variables correspond to the day previous to the day of interest.
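The screening step can be sketched as below. Two assumptions are made that the text does not state: the 0.3 threshold is applied to the absolute value of r (so that inversely related variables such as wind speed can pass), and the t-ratio is taken as the slope t statistic of each individual regression, t = r·sqrt((n − 2)/(1 − r²)). The data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_ratio(r, n):
    """Slope t statistic of a simple linear regression with correlation r
    over n observations (assumed definition; undefined for |r| = 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def screen_variables(candidates, ozone, r_min=0.3):
    """Keep the variables whose |r| against ozone is at least r_min."""
    kept = {}
    for name, values in candidates.items():
        r = pearson_r(values, ozone)
        if abs(r) >= r_min:
            kept[name] = (r, t_ratio(r, len(ozone)))
    return kept


# Hypothetical daily values, for illustration only.
ozone = [0.05, 0.07, 0.06, 0.09, 0.11, 0.08]
candidates = {
    "temperature": [24, 27, 25, 30, 33, 29],
    "wind_direction": [200, 90, 90, 90, 200, 90],
}
kept = screen_variables(candidates, ozone)
```

With these made-up numbers, temperature is retained and wind direction is dropped; ranking the retained variables by |t| would then give their relative importance.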
With the results shown in Table 2, the database used had daily maximum values of temperature (°C), solar radiation (W/m2), nitrogen dioxide (NO2) on the previous day and oxides of nitrogen (NOX) on the previous day, the latter two measured from 3 p.m. to 11 p.m.; and daily average values of wind speed (km/h), humidity (%), pressure (mbar) and precipitation (mm).
4. Artificial neural networks
The artificial neural network employed was a multilayer backpropagation network, which has been used successfully in several studies (Garcia & Shigidi, 2005, Kuo et al., 2003, Helle et al., 2001; Yesilnacar et al., 2007; Yetilmezsoy & Demirel, 2007).
The important feature of this network is its ability to self-adapt the weights of neurons in intermediate layers to learn the relationship between a set of patterns given as examples and their corresponding outputs, so that after having been trained, it can apply the same relationship to new input vectors and produce appropriate outputs from inputs that the system has never seen before, a feature known as the generalizability of an ANN (Mehrotra et al., 1997).
This type of network consists of three layers. There is an input layer Ai with m neurons, an output layer Ck with o neurons and at least one hidden layer Bj with n internal neurons. Each neuron in a layer (except the input layer) receives inputs from all neurons of the previous layer and sends its output to all neurons of the next layer (except the output layer), as shown in Figure 2 (Comrie, 1997).
The learning algorithm involves a forward propagation step in which the input pattern is presented to the network and propagated through the layers until it reaches the output layer, followed by a backward propagation step in which errors are passed from the output layer back to all the neurons in the intermediate layer and then back to the input layer, adjusting the synaptic weights so that the system converges (Gurney, 2003). In this way the network learns to recognize different features of the input patterns (Freeman et al., 1993, Lek et al., 2000). The important point is that each iteration of the ANN decreases the error between the actual data and forecast values.
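The forward and backward steps can be illustrated with a very small network. The sketch below uses plain per-sample gradient descent, which is a simpler technique than the Levenberg-Marquardt training used in this study, together with a log-sigmoid hidden layer and a linear output neuron. The data, network size, learning rate and function names are hypothetical.

```python
import math
import random


def logsig(z):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, seed=0):
    """Small random weights for an n_in x n_hidden x 1 network."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Forward propagation: returns hidden activations and the linear output."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["W1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"]
    return hidden, output


def mse(net, X, y):
    """Mean square error over a data set."""
    return sum((forward(net, x)[1] - t) ** 2 for x, t in zip(X, y)) / len(X)


def train_epoch(net, X, y, lr=0.1):
    """One pass of per-sample backpropagation with plain gradient descent."""
    for x, t in zip(X, y):
        hidden, out = forward(net, x)
        err = out - t  # derivative of 0.5 * err**2 with respect to the output
        for j, h in enumerate(hidden):
            # Error passed back through the log-sigmoid, whose slope is h * (1 - h)
            delta_h = err * net["W2"][j] * h * (1.0 - h)
            net["W2"][j] -= lr * err * h
            for i, xi in enumerate(x):
                net["W1"][j][i] -= lr * delta_h * xi
            net["b1"][j] -= lr * delta_h
        net["b2"] -= lr * err


# Hypothetical scaled inputs and targets, for illustration only.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
y = [0.0, 0.5, 0.3, 0.8, 0.4]

net = init_network(n_in=2, n_hidden=3)
loss_before = mse(net, X, y)
for _ in range(500):
    train_epoch(net, X, y)
loss_after = mse(net, X, y)
```

Comparing `loss_before` with `loss_after` shows the error between targets and network outputs shrinking as the weights are adjusted.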
To develop the ANN, we used MATLAB R2007a, specifically the Neural Network Toolbox (Wang et al., 2006, Yetilmezsoy & Demirel et al., 2008, Garcia et al., 2008).
4.1. Parameters used to build artificial neural network models
The training algorithm selected for the ANN models was the Levenberg-Marquardt algorithm (TRAINLM), because it achieves rapid convergence (Beale et al., 2010, Yetilmezsoy & Demirel et al., 2008, Yesilnacar et al., 2007, Wang et al., 2006), with a learning rate of 0.001. It is worth noting that increasing or decreasing the learning rate neither improved nor impeded the performance of the models, so the 0.001 value was maintained. The ANN models were trained for 10,000 iterations on the training data (Comrie, 1997; Guardani et al., 1999).
To evaluate the results of the ANN models, three performance criteria were considered: the mean square error (MSE), the mean squared error with regularization (MSEREG) and the error sum of squares (SSE). In the hidden layer, a log-sigmoid transfer function (LOGSIG) was used, and in the output layer, the transfer function was linear (PURELIN).
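For reference, these criteria and transfer functions can be written as follows, where t_i is the observed value, y_i the network output, N the number of patterns and w_j the n network weights and biases. The MSEREG expression follows the usual Neural Network Toolbox form with a performance ratio γ; it is stated here as an assumption, since the chapter does not give the formula or the value of γ used.

```latex
\begin{align*}
\mathrm{SSE} &= \sum_{i=1}^{N} (t_i - y_i)^2 \\
\mathrm{MSE} &= \frac{1}{N} \sum_{i=1}^{N} (t_i - y_i)^2 \\
\mathrm{MSEREG} &= \gamma \, \mathrm{MSE} + (1 - \gamma) \, \frac{1}{n} \sum_{j=1}^{n} w_j^2 \\
\mathrm{logsig}(x) &= \frac{1}{1 + e^{-x}}, \qquad \mathrm{purelin}(x) = x
\end{align*}
```

SSE and MSE differ only by the constant factor 1/N, so they rank a fixed data set identically; MSEREG additionally penalizes large weights.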
4.2. Number of hidden layers and hidden layer neurons
ANN models generally have acceptable performance with three layers: input, hidden and output (Del Brio & Sanz, 2001, Yetilmezsoy & Demirel et al., 2008, Helle et al., 2001). Deciding the number of neurons in the hidden layer is usually not so obvious, so the decision was based on the rules suggested by Goethals et al. (2007). The number of hidden neurons is based on the number of input variables (Ni) and output nodes (No), as shown in Table 3.
| Rule | Hidden neurons for Ni = 9, No = 1 |
|---|---|
| 2/3 * Ni | 2/3 * (9) = 6 |
| 0.75 * Ni | 0.75 * (9) = 6.75 ≈ 7 |
| 0.5 * (Ni + No) | 0.5 * (9 + 1) = 5 |
| 2 * Ni + 1 | (2 * (9)) + 1 = 19 |
| 2 * Ni | 2 * (9) = 18 |
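The rules in the table are simple enough to evaluate programmatically, which is convenient if the number of inputs changes. The sketch below is illustrative; the function name is hypothetical and results are rounded to the nearest integer.

```python
def hidden_neuron_candidates(n_inputs, n_outputs):
    """Evaluate the five rules of thumb and round each to the nearest integer."""
    rules = {
        "2/3 * Ni": (2.0 / 3.0) * n_inputs,
        "0.75 * Ni": 0.75 * n_inputs,
        "0.5 * (Ni + No)": 0.5 * (n_inputs + n_outputs),
        "2 * Ni + 1": 2 * n_inputs + 1,
        "2 * Ni": 2 * n_inputs,
    }
    return {rule: int(round(value)) for rule, value in rules.items()}


# Nine input variables and one output node, as in this study.
candidates = hidden_neuron_candidates(9, 1)
```

For Ni = 9 and No = 1 the rules span 5 to 19 hidden neurons, a range that brackets the four hidden-layer sizes tested (6, 10, 12 and 15).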
With these rules, and various studies predicting tropospheric ozone (Comrie, 1997; Guardani et al., 1999; Hooyberghs et al., 2005; Jiang et al., 2004; McKendry et al., 2002; Melas et al., 2000; Davis & Speakman, 1999; Hubbard & Cobourn, 2001), four arrays were created for training and testing the ANN models, each with a different number of neurons in the hidden layer, as shown in Table 4. This is because there is no rule for the optimal number of neurons in the hidden layer (Yetilmezsoy and Demirel et al., 2008, Helle et al., 2001). In each problem, different arrangements should be tried for organizing the internal representation, selecting the one that gives the best results according to the stated objectives.
5. Development of artificial neural network models
Four different ANN structures were created: 9×6×1 (nine input signals, six nodes in the hidden layer and one node in the output layer) (R-6), 9×10×1 (R-10), 9×12×1 (R-12) and 9×15×1 (R-15).
In training, the three performance criteria mentioned above were used: mean square error (MSE), mean squared error with regularization (MSEREG) and error sum of squares (SSE).
As a source of data for the training process, a database was used which contained daily maximum values of temperature (°C), solar radiation (W/m2), nitrogen dioxide (NO2) on the previous day and nitrogen oxides (NOX) on the previous day, the latter two for the period 3 p.m. to 11 p.m., and daily average values of wind speed (km/h), humidity (%), pressure (mbar) and precipitation (mm) for the period 1999–2004. The database contained 2065 validated records for each variable.
5.1. Training of ANN models
The models were trained on 1999–2004 data. Real and predicted O3 values were classified into three concentration ranges according to the NOM-020-SSA1-1993 standard, quantifying the number of data falling within each range (percentage of correct answers) in order to estimate the performance of each model.
This was done under the assumption that it would be very difficult to obtain specific concentrations from the models. The concentration ranges were: low (<0.06 ppm), intermediate (0.06–0.11 ppm) and high (>0.11 ppm), the latter corresponding to values above the maximum allowed by the standard.
The number of observed values and estimated values in each of these ranges were counted to find how many times the model made a correct estimate.
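The counting procedure can be sketched as follows, assuming that the boundary values 0.06 and 0.11 ppm belong to the intermediate range and that a prediction counts as correct when it falls in the same range as the observation. The function names and sample concentrations are hypothetical.

```python
LOW_LIMIT = 0.06   # ppm, lower edge of the intermediate range
HIGH_LIMIT = 0.11  # ppm, upper edge of the intermediate range


def ozone_range(conc_ppm):
    """Classify a concentration as low, intermediate or high."""
    if conc_ppm < LOW_LIMIT:
        return "low"
    if conc_ppm <= HIGH_LIMIT:
        return "intermediate"
    return "high"


def hits_per_range(observed, predicted):
    """For each observed range, count the days and the days on which the
    prediction falls in the same range as the observation."""
    totals = {"low": 0, "intermediate": 0, "high": 0}
    hits = {"low": 0, "intermediate": 0, "high": 0}
    for obs, pred in zip(observed, predicted):
        obs_range = ozone_range(obs)
        totals[obs_range] += 1
        if ozone_range(pred) == obs_range:
            hits[obs_range] += 1
    return hits, totals


# Hypothetical daily maxima (ppm), for illustration only.
observed = [0.04, 0.05, 0.08, 0.10, 0.12, 0.13]
predicted = [0.05, 0.07, 0.09, 0.10, 0.10, 0.12]
hits, totals = hits_per_range(observed, predicted)
```

Dividing each entry of `hits` by the matching entry of `totals` gives the percentage of correct answers per range.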
In order to refine the 12 trained models (four structures, each trained with three performance criteria), an analysis based on the correlation coefficient (r) was carried out. A linear regression was performed between real ozone and estimated ozone, as shown in Table 4. Based on the results, the models with 12 and 15 neurons in the hidden layer were chosen, as they had the best correlation between the calculated and actual data, in both cases using MSE to evaluate performance.
|No. of neurons in hidden layer||Performance criterion|
Figures 3 and 4 show correlation plots of the networks with 12 and 15 neurons in the hidden layer, respectively, during the training phase using 1999–2004 data. The scatter of the point clouds is very similar in the two cases, corresponding to correlation coefficients close to each other.
5.2. Performance of ANN models
Once training was completed, performance was evaluated with 2005 data. This second database has 173 validated values for each parameter under the same conditions as the training database, containing daily data from January to June 2005.
Table 5 shows the number and percentage of times that correct concentrations were obtained by the networks with respect to the observed values, where R-12 and R-15 refer to networks with 12 and 15 neurons in the hidden layer respectively.
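| Model | Low (<0.06 ppm) | Intermediate (0.06–0.11 ppm) | High (>0.11 ppm) |
|---|---|---|---|
| R-12 | 41 (52%) | 33 (43%) | 0 (0%) |
| R-15 | 50 (63%) | 48 (62%) | 2 (12%) |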
Figure 5 shows the regression for the ANN model with 12 neurons in the hidden layer (R-12), and Figure 6 shows the regression for the ANN model with 15 neurons in the hidden layer (R-15), both with data from January to June 2005.
The performance of the two networks is very similar. However, it is clear that it is difficult for both models to detect ozone concentrations exceeding the standard, which is important for this study.
In order to remedy the estimation problem in the high concentration range, it was decided to scale the value estimated by the network. Thus, the final estimated value of ozone (O3) is calculated as
$$O_{3,\mathrm{final}} = \alpha \, O_3$$
where α is the scaling factor and O3 is the ozone concentration (ppm) estimated by the neural network model. The value of α was obtained by an incremental search: candidate values were tried until the number of scaled estimates exceeding the standard reached the number of observed exceedances for each year, without knowing (until then) whether the days on which the estimated concentration exceeded the standard were also the days on which this actually occurred. This process yielded an average value of α for both models of 1.21. Thus, the ozone values estimated by each network were multiplied by 1.21, giving the results in Table 6, which shows that efficiency is lost in the low range in exchange for a gain in the intermediate and high ranges (which are of greatest interest).
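One way such an incremental search could be carried out for a single year is sketched below. The starting value of 1.0, the step of 0.01, the upper bound and the function name are assumptions made for illustration; the chapter does not report these details, and the sample predictions are hypothetical.

```python
def find_scaling_factor(predicted, n_observed_exceedances, limit=0.11,
                        step=0.01, max_alpha=2.0):
    """Smallest alpha on a grid starting at 1.0 for which the number of scaled
    predictions above `limit` reaches the number of observed exceedances.
    Returns None if no alpha up to max_alpha achieves this."""
    k = 0
    while True:
        alpha = 1.0 + k * step
        if alpha > max_alpha:
            return None
        n_predicted = sum(1 for p in predicted if alpha * p > limit)
        if n_predicted >= n_observed_exceedances:
            return alpha
        k += 1


# Hypothetical network estimates (ppm) for a period with two observed exceedances.
predicted = [0.05, 0.08, 0.095, 0.10]
alpha = find_scaling_factor(predicted, n_observed_exceedances=2)
scaled = [alpha * p for p in predicted]
```

Note that the search matches only the number of exceedances, not the days on which they occur, so the scaled model still has to be checked day by day against the observations.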
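| Model | Low (<0.06 ppm) | Intermediate (0.06–0.11 ppm) | High (>0.11 ppm) |
|---|---|---|---|
| R-12 (α=1.21) | 18 (23%) | 57 (74%) | 11 (65%) |
| R-15 (α=1.21) | 23 (29%) | 57 (74%) | 8 (47%) |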
Table 6 shows that the overall performance of both networks is approximately 50%, while the detection rate for elevated ozone concentrations is 65% with R-12 and 47% with R-15. The network with 12 neurons in the hidden layer therefore detects a greater number of days in the high range.
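These percentages can be checked directly from the counts in Table 6. The sketch below assumes the observed number of days per range for January to June 2005 (79 low, 77 intermediate, 17 high, 173 in total) quoted elsewhere in this chapter; variable names are illustrative.

```python
# Observed days per range, January to June 2005.
days_per_range = {"low": 79, "intermediate": 77, "high": 17}

# Correct estimates per range after scaling, as listed in Table 6.
hits = {
    "R-12": {"low": 18, "intermediate": 57, "high": 11},
    "R-15": {"low": 23, "intermediate": 57, "high": 8},
}


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


total_days = sum(days_per_range.values())
summary = {}
for model, counts in hits.items():
    summary[model] = {
        "overall": percent(sum(counts.values()), total_days),
        "high": percent(counts["high"], days_per_range["high"]),
    }
```

Both networks come out at roughly half of all days correctly classified, and the difference between them lies almost entirely in the high range.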
With the results, it was decided to work with the model with 12 neurons in the hidden layer; this model was selected in the present study for predicting tropospheric ozone concentrations in the metropolitan area of Guadalajara, Jalisco. Figure 7 shows a comparison with the previously selected model without the scaling factor. As can be seen graphically, the model shows a good trend in the low and intermediate range of tropospheric ozone concentrations but a poor performance for the higher concentrations.
In Figure 8 a scaling factor of 1.216 has been applied. Although accuracy is lost in the low range of concentrations, performance in the midrange is improved from 43% to 74%, that is, by 31 percentage points.
In the high range, the unscaled model did not detect any exceedances; after applying the scaling factor, efficiency rose from 0% to 65%, i.e., the model predicted 11 of the 17 days that exceeded the concentration of 0.11 ppm.
With the results obtained by the model and selected variables, it was concluded that in the GMA, the most important meteorological variables for significantly reducing the tropospheric O3 concentration are wind speed, which can disperse ozone precursors or ozone itself, decreasing the concentration of the pollutant, and rainfall, which washes out the atmosphere, thereby lowering the concentration of ozone as well as the concentrations of precursors and other pollutants (it can be observed that the morning after a day with rainfall tends to be clear in the early hours).
The variables that are important in increasing ozone concentration are maximum temperature, maximum solar radiation, O3 on the previous day, and oxides of nitrogen (NOX and NO2) (Gomez et al., 2006), because they are involved directly in the photochemical cycle of ozone formation (Wark & Warner, 2005).
The important variables related to nitrogen oxides (NO2 and NOX) are maximum values, since these elements are directly involved in the photolytic cycle and influence the formation of ozone. They are measured from 3 p.m. to 11 p.m. because ozone formation takes place between 11 a.m. and 3 p.m., so there is no consumption of these two pollutants and accumulations of these oxides in the afternoon will serve as raw material for the next day.
Along with the physical and chemical variables involved in increasing or decreasing O3 concentrations, the characteristics of the Jalisco basin are also important. It is located 1583 meters above sea level, surrounded by the Sierra Madre Occidental, plateaus and the Neovolcanic Belt, with industrial parks to the NW, SE and SW. The Sierra Madre Occidental is formed by the Los Huicholes, Los Guajolotes and San Isidro mountains, the Gordo hill and the Tequila volcano. The Neovolcanic or Transversal Volcanic Belt includes the Cacoma, Manantlán, Tapalpa and Lalo mountains, among others. Other notable peaks are El Tigre and Garcia, Cerro Viejo, the Tequila Volcano and, to the south, the Nevado de Colima mountain and the Colima Volcano. Together these create a particular basin structure and specific pollutant formation and dispersal patterns that directly affect the GMA and the performance of the model.
The ANN models perform acceptably for predicting ozone in the low and intermediate ranges; however, the aim of this study was to predict high levels of ozone, so it was necessary to use the scaling factor so that the models would be able to predict concentrations in the high range. This scaling factor was obtained in training the models by matching observed and predicted days that exceeded the standard. Using the scaling factor, the model obtained can predict maximum O3 concentrations in ppm with an overall efficiency of approximately 50%, and 65% for the detection of high concentrations.
These models were trained on data from the period between 1999 and 2004 in the Guadalajara Metropolitan Area (GMA), and their performance was evaluated using data from the period from January to June 2005, with data from the MNWS and NEI. The performance of the two models was assessed by comparing forecast and actual ozone concentrations in three ranges (the percentage of correct answers) related to the NOM-020-SSA1-1993 standard: low (< 0.060 ppm), intermediate (0.060–0.110 ppm) and high (> 0.110 ppm), with high concentrations considered to be values that exceeded the standard.
The general characteristics of the selected ANN model for forecasting of O3 are: 9 independent variables (3 chemical and 6 weather). The structural arrangement of the network was 9×12×1 (input × hidden × output); transfer functions were sigmoidal in the hidden layer and linear in the output layer, the training function was TRAINLM; the performance criterion was mean square error (MSE) and the scaling factor was 1.21.
This model is able to predict 22% of concentrations lower than 0.060 ppm, i.e. it predicted 17 of 79 days for this range; a 74% success rate in the intermediate range of concentrations from 0.060 to 0.110 ppm, i.e. 57 days of the 77 days recorded; and 65% success for concentrations greater than 0.110 ppm, i.e. 11 of the 17 days recorded for the 2005 period.
The overall efficiency of the model for the period January to June 2005 was 49.13% with the scaling factor and 54.34% without the factor.
The models obtained employed the meteorological variables maximum temperature and solar radiation, and average values of wind speed, barometric pressure, rainfall, relative humidity for each day of interest, and maximum values of the chemical variables ozone, NOx and NO2 on the previous day (measured from 3 p.m. to 11 p.m.).
Finally, the models generated are easy to implement, have only moderate technological requirements and simple, easily understood structures, giving them minimal operating costs. These models can be used to help alert the community at times when the air quality is undesirable, so that precautionary measures can be taken to safeguard the health of the population.