Validation of an air pollution dispersion model is a very important process. It determines the performance and efficiency of the model under well-defined conditions. These conditions consist of the type of terrain orography (flat or complex), the size of the domain (local, regional, continental, global), the number of grid cells in the domain, the meteorological conditions (strong or weak winds, etc.) and the emission types (stacks, traffic, domestic heating). The results of validation give good guidelines on how, where and when the model can be successfully applied.
Validation is especially important when a model is used for regulatory purposes. The FAIRMODE European guidelines for air pollution modelling explicitly require that a modeling tool be successfully validated in a similar environment (FAIRMODE, 2010). Slovenian legislation (Ur.l. RS, št. 31/2007, 2007), which follows the European Council Directive of 28th June 1984 on combating air pollution from industrial plants (EUR-Lex 84/360/EEC, 1984), requires that the modeling tool for reconstruction of air pollution around stationary industrial sources meet the requirements of complex terrain, because most Slovenian industry is located at the bottom of basins, river canyons and valleys. Complex terrain defines a set of specific atmospheric conditions: low wind speeds, temperature inversions, flow over topography, presence of terrain obstacles or discontinuities (land-sea, urban-rural environment), etc. The Lagrangian particle dispersion model is at the moment the only air pollution model that successfully meets these requirements (Wilson and Sawford, 1996, Schwere et al., 2002). It has evolved significantly in recent years and has moved from research use to operational regulatory use (Tinarelli et al., 2000, Graff, 2002).
Validations over complex terrain are still very rare. They are very important for the research community and for governmental environmental agencies. The research community uses the results for further development and improvement of modeling techniques, and environmental agencies use them for setting up and implementing regulatory policies.
A study has been made to improve the traditional air pollution model validation methodology. The methodology is upgraded to estimate the inaccuracy in position and time of a Lagrangian particle air pollution dispersion model. The new validation methodology is demonstrated on a field data set from very complex terrain in the Šaleška region (Slovenia). The Lagrangian particle air pollution dispersion model SPRAY, produced by ARIANET Srl from Milano, Italy, is selected for validation because it complies with Slovenian legislation on air pollution modeling over complex terrain. Validation is performed on one very complex air pollution situation that is very difficult to reconstruct and includes the phenomenon of air pollution accumulation. Traditional statistical indexes are determined at four locations in different directions from the air pollution source. To estimate the model’s inaccuracy in position and time, the new enhanced validation methodology is demonstrated and described in detail. The results of this validation will serve for future improvements of the selected air pollution dispersion model.
2.1. Traditional validation methodology
The traditional validation methodology for air pollution modeling is based on a statistical comparison between measured and reconstructed air pollution concentrations in the environment. It is well described in the model validation framework named “Model evaluation toolkit” that has been established and maintained by Olesen (1996).
Measured data are collected from automatic environmental measuring stations located in the area of interest (domain), usually around sources of air pollution. Reconstructed concentrations are obtained from the air pollution modeling simulation.
In an air pollution model the area of interest usually consists of a grid of cells, where each cell describes the average air pollution situation in a certain part of the domain (e.g. in the study case presented in the next chapter the domain is split into 100 x 100 cells in the horizontal and 20 layers in the vertical, which gives 200 000 cells for the domain). For the comparison, the reconstructed average concentration from the ground cell where the measuring station is located is taken. An example is presented on Figure 1.
Statistical analysis of the data is performed for a selected time interval in which both measured and reconstructed data are available. For this time interval a set of data patterns must be prepared. Each data pattern in this set consists of a pair of measured and reconstructed concentrations obtained at time step t, as presented in equation (1).
Using traditional validation methodology most often three statistical indexes are determined:
the correlation coefficient (CR):
the normalized mean square error (NMSE):
the fractional bias (FB):
Definitions of variables and functions for determination of statistical indexes:
C_meas(t)...measured concentration at time step t
C_recon(t)...reconstructed concentration at time step t
Ĉ_meas...average measured concentration
Ĉ_recon...average reconstructed concentration
σ_C...standard deviation of (measured or reconstructed) concentrations
t...time step index
T...length of full time interval (number of measured concentrations)
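The three indexes can be sketched in code as follows. This is a minimal illustration that assumes the forms commonly used in the dispersion model evaluation literature: CR as the covariance divided by the product of the two standard deviations, NMSE as the mean squared difference divided by the product of the two averages, and FB as the difference of the averages divided by their mean. The function name and the sign convention of FB (positive when the model underestimates) are assumptions, not taken from the original equations.

```python
import math


def validation_indexes(measured, reconstructed):
    """Return (CR, NMSE, FB) for paired concentration series.

    Assumed standard forms (not copied from the original equations):
      CR   = mean((Cm - avg_m) * (Cr - avg_r)) / (sd_m * sd_r)
      NMSE = mean((Cm - Cr)^2) / (avg_m * avg_r)
      FB   = (avg_m - avg_r) / (0.5 * (avg_m + avg_r))
    Series with zero variance or zero average are not handled here.
    """
    if len(measured) != len(reconstructed) or not measured:
        raise ValueError("series must be non-empty and of equal length")
    T = len(measured)
    avg_m = sum(measured) / T
    avg_r = sum(reconstructed) / T
    # Population standard deviations of both series.
    sd_m = math.sqrt(sum((c - avg_m) ** 2 for c in measured) / T)
    sd_r = math.sqrt(sum((c - avg_r) ** 2 for c in reconstructed) / T)
    pairs = list(zip(measured, reconstructed))
    cov = sum((m - avg_m) * (r - avg_r) for m, r in pairs) / T
    cr = cov / (sd_m * sd_r)
    nmse = (sum((m - r) ** 2 for m, r in pairs) / T) / (avg_m * avg_r)
    fb = (avg_m - avg_r) / (0.5 * (avg_m + avg_r))
    return cr, nmse, fb
```

With this convention a reconstruction that is exactly half of every measured value still gives CR = 1, which is why NMSE and FB are needed alongside the correlation.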
2.2. Enhanced validation methodology
The model validation framework named “Model evaluation toolkit”, maintained by Olesen (1996), outlines difficulties that can arise in model validation. Differences between measured and reconstructed concentrations are caused by measuring errors, inherent uncertainty, input uncertainty and model formulation error. The paper by Grašič et al. (2007) determined that the model also exhibits inaccuracy in position and time. To estimate these inaccuracies, an enhanced validation methodology is presented. In addition to the cell where the station is located, it also uses the reconstructed ground level concentrations in the neighboring cells. During enhanced validation each measured value is compared with one reconstructed concentration selected from a set of reconstructed concentrations. This set NC, described in equation (5), consists of the average concentrations in the cell where the station is located and in the neighboring cells. The neighborhood is defined in position (space) and in time: a neighborhood of 1 cell in position gives a set of 9 cells, as presented on Figure 2 and in equation (6), and a neighborhood of 1 time interval consists of 3 time intervals, as presented on Figure 3 and in equation (7).
Definitions of variables for determination of set of neighborhood concentrations NC:
NC...set of reconstructed concentrations in the station's neighborhood
t...time step index
ΔT...length of neighborhood in time scale (number of time steps)
m...index (number) of cell in east-west direction
n...index (number) of cell in north-south direction
ΔH... length of neighborhood in position (space) (number of cells)
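With the variables defined above, the neighborhood set can be written compactly. The following is a sketch only: the notation C_recon(t, m, n) for the ground level concentration in cell (m, n) at time step t and the offsets i, j and τ are assumed here. A square neighborhood in space and a symmetric one in time are assumed because they reproduce the 9 cells and 3 time intervals quoted for ΔH=1 and ΔT=1.

```latex
% Assumed notation: C_{recon}(t, m, n) is the reconstructed ground level
% concentration in cell (m, n) at time step t; i, j, \tau are offsets.
\[
NC(t) = \bigl\{\, C_{recon}(t+\tau,\; m+i,\; n+j) \;:\;
        |i| \le \Delta H,\ |j| \le \Delta H,\ |\tau| \le \Delta T \,\bigr\}
\]
% Number of elements in the set (away from the domain border):
\[
|NC| = (2\Delta H + 1)^2 \, (2\Delta T + 1)
\]
```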
Finally, in the enhanced validation methodology each measured value is compared with one reconstructed concentration selected from the set of neighborhood concentrations NC. From this set one concentration C_BMrecon is selected using the best matching function with respect to the measured concentration, as described in equation (8). The best matching function selects the element of the NC set whose difference from the measured concentration is the lowest compared to all other elements of the set.
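The selection described above can be sketched in a few lines. The data layout `field[t][m][n]` (time step, east-west index, north-south index), the function names, and the choice to skip cells outside the domain or outside the simulated period are assumptions made for this illustration. The difference is taken here as the absolute difference.

```python
def neighborhood(field, t, m, n, dH, dT):
    """Collect reconstructed ground level concentrations around (t, m, n).

    field[t][m][n] is an assumed layout; indexes that fall outside the
    domain or outside the simulated period are skipped.
    """
    values = []
    for tt in range(t - dT, t + dT + 1):
        if not 0 <= tt < len(field):
            continue
        for mm in range(m - dH, m + dH + 1):
            if not 0 <= mm < len(field[tt]):
                continue
            for nn in range(n - dH, n + dH + 1):
                if 0 <= nn < len(field[tt][mm]):
                    values.append(field[tt][mm][nn])
    return values


def best_match(measured_value, candidates):
    """Return the candidate with the smallest absolute difference."""
    return min(candidates, key=lambda c: abs(c - measured_value))
```

Calling `best_match(c_meas, neighborhood(field, t, m, n, 0, 0))` reduces to the traditional point-to-point comparison, because the set then contains only the station cell at time step t.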
3. Study case
The presented method is demonstrated on a field data set from complex terrain. The following sub-chapters describe the field data set from the Šaleška region (Slovenia). This field data set has been chosen for several reasons:
The second reason is the high emissions from the thermal power plant, which were about 100000 tons of sulphur dioxide SO2 and 12400 tons of nitrogen oxides NOx per year (Elisei, 1991), because no desulphurization plant had been installed at that time. These high emissions represented the main air pollution source in the region, where ambient SO2 concentrations higher than 1 mg/m3 were measured at the surrounding automatic environmental measuring stations. All other local sources of air pollution can therefore practically be neglected, and the experimental campaign could be treated as a tracer experiment.
The third reason is the availability of all measured data from the automatic environmental measuring stations and the emission station for the whole period of the measuring campaign. The complete database is available in the final report (Elisei et al., 1991) and also on the internet (Šoštanj91 Campaign home page, 2007).
And the fourth reason is that the database obtained during the campaign has been used to validate several available air pollution models (Grašič, 2007).
The study case continues with a description of the air pollution modelling and a comparison of validation results obtained with the standard and the newly presented method.
3.1. “Šoštanj91” field data set description
An experimental measuring campaign named “Šoštanj91” was performed in spring 1991, from 15th of March till 5th of April, in the surroundings of the Thermal power plant Šoštanj (TPPŠ). The main purpose of the campaign was to determine the environmental impact of the air pollution from the three stacks of the thermal power plant. The emphasis was on the meteorological conditions that cause severe air pollution episodes.
TPPŠ is located in the centre of the Šaleška valley, as presented on Figure 5. In the central part of the Šaleška valley there is a plain located north of the Paka river. The average altitude of the valley is three hundred meters above sea level. The valley is surrounded by hills on the south side and by the high mountains of the Karavanke Alps on the west, north and east sides. There are two towns and several small villages in the valley and its surroundings, where approximately 36000 people lived at the time of the campaign (Elisei et al., 1991). The map on Figure 6 shows the location of the Šaleška valley in the north-eastern part of Slovenia.
The experimental campaign was performed by researchers from three research institutions: ENEL-CRAM and CISE, Milano, Italy and Jozef Stefan Institute, Ljubljana, Slovenia. Data obtained during the campaign have been used to validate several available air pollution models: standard and advanced Gaussian models, a Gaussian puff model and a Lagrangian particle dispersion model (Brusasca et al., 1992, Božnar et al., 1993, Božnar et al., 1994). The final results of these studies showed that the Lagrangian particle dispersion model is the most effective tool for air pollution modelling in very complex terrain. The campaign is described in detail in a final report (Elisei et al., 1991), where all measured data are also available. The database consists of measurements from different measuring systems: automatic measuring stations of the Environmental Information System (EIS) maintained by TPPŠ, an automatic mobile laboratory, one mobile Doppler SODAR and a DIAL. Pictures of some of the equipment are presented on Figure 7.
Environmental Information System of TPPŠ consisted of six stationary automatic measuring stations and one mobile station. Locations of the stations are presented on Figure 5. Environmental parameters measured on stations are presented in Table 1.
During the campaign TPPŠ had three operating stacks of different heights: 100 m, 150 m and 230 m. None of the stacks had a desulphurization plant installed during the experimental campaign. Measured emissions are presented in Table 2, where static and dynamic parameters are given. Emissions from generators Block 1, Block 2 and Block 3 are emitted from one stack named Block 1,2,3. A picture of TPP Šoštanj is presented on Figure 7.
3.2. Air pollution modelling
In this study case a Lagrangian particle air pollution dispersion model has been chosen for validation. The name of the model is SPRAY and its detailed description is given in papers by its authors (Brusasca et al., 1992, Tinarelli et al., 2000).
The model has been chosen for validation for several reasons:
The first version of the model has already been validated on the “Šoštanj91” field data set (Brusasca et al., 1992, Božnar et al., 1993, Božnar et al., 1994). The model has evolved significantly in recent years. It has moved from research use to operational regulatory use (Tinarelli et al., 2000).
The model follows the new Slovenian legislation, which requires the use of a Lagrangian particle dispersion model for complex terrain. The model is coupled with a corresponding meteorological pre-processor module which is able to reconstruct three-dimensional diagnostic non-divergent wind fields.
Most Slovenian industry is located in complex terrain at the bottom of valleys, river canyons or basins. The results of validation can therefore also be applied to other similar cases in Slovenia or anywhere else where complex terrain is present.
The latest version has also recently been validated on the “Šoštanj91” field data set (Grašič, 2007). The validation results in this study are extended and focus on the validation method rather than on the validation of the model.
A detailed description of the model parameters and settings for this study is given in the paper about the latest validation of the model (Grašič, 2007). Measured data from the mobile Doppler SODAR and from the automatic environmental measuring stations described in Table 1 were used as input to the meteorological pre-processor. All measured data are available in half-hour intervals. The mobile Doppler SODAR was located in the centre of the domain. Meteorological fields were reconstructed at 150 m horizontal resolution. The same resolution was also used to describe the complex topography (i.e. orography, Corine land use, etc.). From this topography and the local meteorological data, three-dimensional mass consistent wind fields were generated and used in the Lagrangian particle dispersion model for air pollution reconstruction.
The Lagrangian particle dispersion model generated half-hour average ground concentration fields at the same 150 m resolution as the meteorological pre-processor. It used Thomson’s 1987 scheme with Gaussian random forcing (Thomson, 1987). The number of emitted virtual particles was set so as to assure a minimum resolution for ground level concentrations of less than 1 μg/m3. Anfossi’s formulation (Anfossi, 1993) was used for the plume rise of hot stack plumes, taking into consideration horizontal and vertical variations of both mean wind and atmospheric stability.
Air pollution reconstruction has been made for the full duration of the experimental campaign, from 15th of March till 5th of April 1991. Results of the simulation are available in half-hour intervals. Each half-hour result represents the average air pollution situation over the complete domain for one half-hour interval. This result is a three-dimensional (3D) concentration field describing the concentration in each cell of the domain. The domain consists of 100x100 grid cells in the horizontal and 20 layers in the vertical, that is 200000 grid cells in one 3D concentration field. For validation of the model only the two-dimensional (2D) concentration field representing ground level concentrations is relevant. This ground-level concentration field consists of the 100x100 cells of the ground layer, where each cell measures 150 m x 150 m in the horizontal and 10 m in the vertical.
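The grid arithmetic above, and the lookup of the ground cell that contains a station, can be sketched as follows. The grid dimensions and cell size come from the text. The coordinate origin at the south-west corner of the domain, the index order `field_3d[k][m][n]` with `k = 0` as the ground layer, and the function names are assumptions made for this illustration.

```python
CELL_SIZE_M = 150.0   # horizontal cell size from the text
NX = NY = 100         # horizontal grid cells in each direction
NZ = 20               # vertical layers


def station_cell(x_m, y_m, x0_m=0.0, y0_m=0.0):
    """Return the (m, n) index of the cell containing a station.

    x_m, y_m are station coordinates in metres; x0_m, y0_m is the assumed
    south-west corner of the domain in the same coordinate system.
    """
    m = int((x_m - x0_m) // CELL_SIZE_M)
    n = int((y_m - y0_m) // CELL_SIZE_M)
    if not (0 <= m < NX and 0 <= n < NY):
        raise ValueError("station lies outside the modelling domain")
    return m, n


def ground_layer(field_3d):
    """Return the 2D ground level field, assuming field_3d[k][m][n]
    with k = 0 as the lowest (ground) layer."""
    return field_3d[0]
```

With these numbers the domain spans 100 x 150 m = 15 km in each horizontal direction and contains 100 x 100 x 20 = 200000 cells.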
For demonstration of the new validation method only one very complex air pollution situation has been selected. It is a typical complex terrain situation that is very difficult to reconstruct and still represents the greatest challenge to all available air pollution dispersion models. The situation is described in detail in the paper by Grašič et al. (2007). It lasted from 1st of April 1991 at 20:00 until 2nd of April 1991 at 20:00.
The spreading of the reconstructed plume in the three-dimensional domain is presented on Figure 9, which shows that the plume was spreading in all directions over the domain during a relatively short period of time. This is also seen from the Doppler SODAR measurements presented on Figure 10. This graph shows the SODAR measurements for each half-hour time interval at different heights. Each arrow on the graph represents the direction of the horizontal wind component at a certain height. The length of the arrow represents the magnitude of the horizontal wind speed component.
The spreading of air pollution in all directions is also confirmed by measurements of half-hour average SO2 concentrations at four environmental stations in different directions from TPPŠ, as presented on Figures 11, 12, 13 and 14.
The paper by Grašič et al. (2007) also reports that the phenomenon of air pollution accumulation occurred during this selected period. A very stable meteorological situation was the main cause of very slow mixing of the plume with the surrounding air. The pollution plume was moving very slowly, in accordance with the measured average wind speed and direction. At the beginning of this situation the air pollution, from the point of view of a measuring station, came from the direction of the source. But when the main wind reversed its direction, the air pollution cloud changed its direction as well. From then on, from the point of view of the measuring station, the air pollution cloud appeared to be coming from a virtual emission source located on the other side. In our case the selected domain was not wide enough for the Lagrangian particle dispersion model to capture this phenomenon. Part of the air pollution cloud was lost out of the domain, which should be taken into account when the model is validated. The Lagrangian particle model could reproduce this phenomenon correctly if the domain were widened, but then the final resolution of the domain would have to be decreased due to the computational limits of the model. A decrease of the resolution (e.g. from 150 m grid cells to 500 m grid cells) would give coarser results, and some local complex terrain effects could be lost.
3.2. Validation results
Validation of the modelling results is performed at four stations located in different directions from the thermal power plant Šoštanj. The four locations are selected according to the positions of four automatic environmental measuring stations, as presented on Figure 5: Graška Gora, Šoštanj, Veliki Vrh and Zavodnje. From all these stations measurements of half-hour average SO2 concentrations are available for the selected air pollution situation from 1st of April 1991 at 20:00 till 2nd of April 1991 at 20:00.
As presented on Figure 11, the measured SO2 concentration increased due to a wind change at the beginning of the selected air pollution situation. The wind at an approximate height of 250 m changed its direction from north-west to south-east. The next wind change was toward the south, which caused an increase of SO2 concentrations at the Šoštanj (Figure 12) and Veliki Vrh (Figure 13) stations. Figure 14 presents measured SO2 concentrations at the Zavodnje station, which is the most distant station from TPPŠ. This result is especially interesting because of the measured SO2 concentration peak at the end of the air pollution situation. This peak was caused by the air pollution accumulation phenomenon, as described in the previous sub-chapter about air pollution modelling. Because the station is located near the border of the domain (Figure 5), it is expected that the model results will be underestimated in this case.
In the following sub-chapters measured and reconstructed SO2 concentrations at the locations of the four presented stations are compared using the traditional and the enhanced validation methods. The sub-chapter on the traditional validation methodology describes the modelling problems that cause under- or overestimation of reconstructed concentrations. The sub-chapter on the enhanced validation methodology presents different comparison results, which can be used as a good estimate of the model’s inaccuracy in position and time.
3.2.1. Traditional validation results
Figure 11 shows the comparison between measured and reconstructed SO2 concentrations at station Graška Gora. Reconstructed concentrations agree very well with the measured ones. The comparison using traditional statistical indexes for the complete duration of the experimental measuring campaign from 15th of March till 5th of April 1991, presented in Table 3, also shows good agreement, with the correlation coefficient reaching a value higher than 0.3.
The same comparison of SO2 concentrations at station Šoštanj is presented on Figure 12. The comparison on the graph shows underestimation of the reconstructed concentration values. The first reconstructed peak at 11:30 is underestimated due to the model’s inaccuracy in position. The paper by Grašič et al. (2007) shows that the correct peak was reconstructed just two cells away from the station. The second underestimated concentration peak is caused by the short distance between the station and the stacks (approximately 500 m). There are two effects that are not well captured due to this short distance. The first is the stack tip down-wash effect. The second is the combination of low wind speed in the direction towards the station and convective turbulence (Grašič et al., 2007). The comparison using traditional statistical indexes presented in Table 3 shows almost no correlation and medium underestimation of reconstructed concentrations.
The comparison of SO2 concentrations at Veliki Vrh station is presented on Figure 13. During the air pollution situation two concentration peaks were reconstructed (from 00:00 till 04:00 and from 06:00 till 12:00). Neither peak is correctly reconstructed, due to the inaccuracy of the model in position. In a real situation such peaks can appear just a few meters from the measuring station without being detected (Grašič et al., 2007). The comparison presented in Table 3 shows poor correlation between measured and reconstructed concentrations.
An even more obvious case of the model’s inaccuracy in position is presented on Figure 14, where measured and reconstructed SO2 concentrations at the location of Zavodnje station are compared. The phenomenon is more pronounced because of the long distance between the station and the thermal power plant. It generated the first reconstructed peak in the time interval from 00:00 till 04:00. The second measured concentration peak has been underestimated because the accumulated air pollution was lost: the domain was not wide enough to capture the phenomenon. This event occurred at the end of the air pollution situation, when the wind changed direction from south back to north-west (Grašič et al., 2007). The comparison presented in Table 3 shows no correlation between measured and reconstructed concentrations and a very high normalized mean square error.
Table 3 presents the results of the comparison between measured and reconstructed SO2 concentrations using statistical indexes for the complete duration of the experimental measuring campaign, from 15th of March till 5th of April 1991. In this case the traditional point-to-point comparison has been made. The results suggest that only the reconstructed concentrations at Graška Gora are satisfactory in comparison with the results published by other authors (Ferrero et al., 1993, Rizza et al., 1996, Kaasik, 2005) who also participated in the model validation framework named “Model evaluation toolkit” established and maintained by Olesen (1996). Within this research Olesen also outlined difficulties that can arise in model validation: differences between measured and reconstructed concentrations are caused by measuring errors, inherent uncertainty, input uncertainty and model formulation error.
A detailed analysis of the selected air pollution situation (Grašič et al., 2007) determined that the inaccuracy in position and time of the reconstructed concentrations has been caused mostly by the model’s sensitivity to wind speeds and directions measured at different stations and by the SODAR. The model’s sensitivity strongly depends on the complexity of the terrain, which in our case is highly complex.
3.2.2. Enhanced validation results
Figures 15, 16, 17 and 18 show the comparison between measured and reconstructed SO2 concentrations at stations Graška Gora, Šoštanj, Veliki Vrh and Zavodnje. There are three types of reconstructed concentrations with different sizes of neighborhood, as described in section 2.2 Enhanced validation methodology:
Recon. (ΔH=0,ΔT=0) - size of neighborhood is 0, only 1 cell where station is located is used for comparison, results are identical to traditional validation method
Recon. (ΔH=1,ΔT=1) - size of neighborhood is 27 cells (9 cells in horizontal scale and 3 cells in time scale)
Recon. (ΔH=2,ΔT=2) - size of neighborhood is 125 cells (25 cells in horizontal scale and 5 cells in time scale)
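The neighborhood sizes in the list above follow from a simple count, sketched below. A square neighborhood of (2ΔH+1) x (2ΔH+1) cells and a symmetric window of (2ΔT+1) time steps are assumed, which matches the 1 and 27 cells quoted for the first two cases. The function names are illustrative. The 150 m cell size and the half-hour time step are taken from the model description.

```python
def neighborhood_size(dH, dT):
    """Number of reconstructed values in the neighborhood:
    (2*dH + 1)^2 cells in space times (2*dT + 1) steps in time."""
    return (2 * dH + 1) ** 2 * (2 * dT + 1)


def neighborhood_extent(dH, dT, cell_size_m=150.0, step_h=0.5):
    """Half-width of the neighborhood in metres and in hours."""
    return dH * cell_size_m, dT * step_h


for dH, dT in [(0, 0), (1, 1), (2, 2)]:
    print(dH, dT, neighborhood_size(dH, dT), neighborhood_extent(dH, dT))
```

For ΔH=2 and ΔT=2 the extent is 300 m and 1 hour on either side of the station cell and time step.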
Agreement between measured and reconstructed concentrations improves significantly as the neighborhood expands. A similar result is obtained from the comparison using traditional statistical indexes for the complete duration of the experimental measuring campaign from 15th of March till 5th of April 1991, presented in Tables 4 and 5. A comparison of the results presented in Tables 3, 4 and 5 shows significant improvement of all statistical indexes.
4. Further improvements of enhanced validation methodology
In the paper by Grašič et al. (2007) validation has been performed using the enhanced validation methodology as explained in chapter 2.2 Enhanced validation methodology. For this validation a neighborhood of 27 cells (one cell in each horizontal direction, ΔH=1, and one time step on the time scale, ΔT=1) has been used. Figures 19, 20, 21 and 22 compare the results obtained in the paper by Grašič et al. (2007) with the results presented in the previous sub-chapter 3.2.2 Enhanced validation results, where a neighborhood of 27 cells has also been used.
This comparison shows slightly better results for the recent study than for the previous one. This is also apparent from the statistical indexes presented in Table 6. The main difference between the two studies is that the previous study removed reconstructed concentrations that had already been used from further comparison. The main idea of the enhanced validation methodology is to assign to each measured concentration one reconstructed concentration from the neighborhood. On the time scale this means that the same reconstructed concentration can be assigned to up to three measured concentrations when the size of the neighborhood is one time interval (ΔT=1). To avoid this, the previous study assigned to each measured concentration a unique reconstructed concentration, namely the one that appeared first in the set. The set was processed in time order from the oldest to the youngest measured concentration. If processing had been performed in the opposite order, from the youngest to the oldest, the results would have been slightly different. To find the best order in which to process this set, a very advanced optimization algorithm operating in three dimensions would have to be developed, which will be our main task in the future. Another task to be solved in parallel is the determination of an appropriate criterion function to measure the success of this optimization algorithm.
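The order dependence described above can be illustrated with a small sketch. It is one possible reading of the unique-assignment idea, not the procedure of the previous study: measured values are processed in a given order, each takes the best matching reconstructed value that has not been used yet, and that value is then removed from further comparison. For brevity the neighborhood is limited to the time scale. All names and the fallback to `None` when no unused candidate remains are assumptions.

```python
def time_window_candidates(recon, dT):
    """Build a function returning {time_index: value} for the
    reconstructed values within dT steps of time step t."""
    def candidates_at(t):
        lo, hi = max(0, t - dT), min(len(recon), t + dT + 1)
        return {tt: recon[tt] for tt in range(lo, hi)}
    return candidates_at


def assign_unique(measured, candidates_at, order):
    """Greedy unique assignment: each reconstructed value is used once."""
    used = set()
    assigned = {}
    for t in order:
        free = {k: v for k, v in candidates_at(t).items() if k not in used}
        if not free:
            assigned[t] = None  # assumed fallback: nothing left to assign
            continue
        key = min(free, key=lambda k: abs(free[k] - measured[t]))
        used.add(key)
        assigned[t] = free[key]
    return assigned


measured = [45.0, 50.0, 35.0]
recon = [10.0, 50.0, 30.0]
cand = time_window_candidates(recon, dT=1)
forward = assign_unique(measured, cand, order=[0, 1, 2])
backward = assign_unique(measured, cand, order=[2, 1, 0])
print(forward)
print(backward)
```

In this toy series the two processing orders give different assignments, which is the effect that a global optimization over the whole set would be meant to remove.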
The traditional air pollution model validation methodology has been extended in this paper. It is based on a statistical comparison between measured and reconstructed air pollution concentrations in the environment, in which different statistical indexes are determined. The method has been upgraded to estimate the inaccuracy in position and time of the Lagrangian particle air pollution dispersion model. To obtain these inaccuracies, additional reconstructed air pollution concentrations from the neighborhood are used. The neighborhood is defined on the spatial and the time scale.
The enhanced validation methodology has been demonstrated on the field data set “Šoštanj91” from very complex terrain in the Šaleška region (Slovenia). The field data set is described in detail. It has been selected mainly because of the high emissions during the experimental campaign, due to which the SO2 air pollution situation behaved as a tracer experiment.
Air pollution modeling has been made using the Lagrangian particle air pollution dispersion model SPRAY produced by ARIANET Srl from Milano, Italy. This model has been chosen for validation because it complies with Slovenian legislation on air pollution modeling over complex terrain, where most Slovenian industry is located.
For validation of the model only one very complex air pollution situation has been selected. It is a typical complex terrain situation that is very difficult to reconstruct and includes the phenomenon of air pollution accumulation and subsequent convective mixing. Validation using standard statistical indexes has been made at four locations in different directions from the air pollution source.
Validation begins with the traditional validation methodology. The comparison between measured and reconstructed SO2 concentrations gives relatively poor results. Only the reconstructed concentrations at one station are satisfactory. It has been determined that these results are caused by the model’s sensitivity to measured wind speeds and directions.
To “measure” the model’s inaccuracies in position and time, the enhanced validation methodology is demonstrated. It gives more satisfactory results at the locations of all stations and it also estimates the inaccuracies. It has been estimated that the model’s inaccuracy is about ±300 m in position and ±1 hour in time, which is an excellent result for such complex terrain. These results give good information for future improvement of the air pollution dispersion model.
On the other hand, inaccuracies of the measurements should also be taken into account during the validation process. It is very important to be aware that measurements are made at a specific location. In certain meteorological conditions these measurements are not representative of the nearest neighborhood. This effect is even more obvious in complex terrain, where the air pollution plume can be present in the nearest neighborhood of the station but is not detected due to a local phenomenon.
The study was partially financed by the Slovenian Research Agency, Project No. L1-2082.