Environmental Sciences » "Current Air Quality Issues", book edited by Farhad Nejadkoorki, ISBN 978-953-51-2180-0, Published: October 21, 2015 under CC BY 3.0 license. © The Author(s).

Chapter 4

Air Pollution Monitoring and Prediction

By Sheikh Saeed Ahmad, Rabail Urooj and Muhammad Nawaz
DOI: 10.5772/59678



1. Introduction

One of the most important emerging environmental issues in Asian cities is air pollution. Air pollution is an atmospheric condition in which the concentration and duration of certain substances present in the air produce injurious and destructive effects on both humans and the surrounding environment [1]. The most common air pollutants are sulfur oxides, nitrogen dioxide, carbon monoxide, carbon dioxide, and particulate matter.

Geographical Information Systems (GISs) are computer-based applications used for mapping and analyzing the earth and related spatially distributed phenomena. GIS applications integrate unique visualizations with common databases, which make it possible to capture, model, manipulate, retrieve, analyze, and present the geographically referenced data. Compared to other information systems, GIS systems have advantages, including the high power of analyzing spatial data and handling large spatial databases.

GIS applications can be used in air quality management and pollution control to handle and manage large amounts of data. GIS systems manage spatial and statistical data, which makes it easier to depict the association between poor air quality and the frequency of human activities that harm environmental health. GIS modeling and statistical analysis also make it possible to examine and predict the impact of climatic variables on air pollution. In this way, GIS systems help in monitoring air pollution and emissions of pollutants from different sources.

Air pollution mapping is a helpful method for determining the concentration of pollutants. Air pollution mapping produces overviews of pollution in cities and identifies the sources of emissions, which helps in controlling them. Different studies have been carried out on air pollution in conjunction with GIS [2-11]. Consequently, GIS applications are necessary in air monitoring to determine air quality and to reduce pollution to a level at which harmful impacts on human health and the environment are reduced.

With the help of GIS applications, an output report of pollutants in Air Quality Management Systems (AQMSs) can be produced in the form of three-dimensional (spatial) records. In an AQMS, the emission time, concentration, and location of air pollutants are regulated in order to achieve the predefined standards for ambient air quality. This encompasses estimating the pollutants' emission schedule to determine the consequences for air quality, and designing alternative emission-control programs that meet air quality standards subject to constraints such as technological viability and lowest cost. In environmental modeling with GIS applications, AQMSs are used to locate monitoring stations, to develop geospatial air quality models, and to build spatial decision-support systems. However, the most significant step in an AQMS is data mining, a technique used to analyze data, uncover hidden patterns, and find interesting information in large databases. The most commonly used technique in data mining is the artificial neural network [12].

The human brain consists of a large number of neurons connected to each other by synapses. These networks of neurons are called neural networks, or natural neural networks. The artificial neural network (ANN) is a mathematical model of a natural neural network: it uses a mathematical or computational model based on a connectionist approach to solve a given problem. The key applications of neural networks are control systems, classification systems, and prediction and vision systems.

Three basic components are needed to build a functional model of a neuron: the synapses, an adder that sums all inputs after they are multiplied by their weights, and an activation function. In Figure 1, synapses are represented by weights; the value of a weight expresses the strength of the connection between an input and the neuron. Negative values reflect inhibitory connections, whereas positive values reflect excitatory connections. The activation function keeps the output of the neuron within an acceptable range, here from -1 to 1.


Figure 1.

Model of a neuron
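
The neuron model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `neuron_output`, the bias term, the example inputs and weights, and the choice of tanh as an activation with range -1 to 1 are all hypothetical and are not taken from the study.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs followed by a tanh activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Adder: every input is scaled by its synaptic weight, then all are summed.
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation: tanh squashes the sum into the range -1 to 1.
    return math.tanh(net)


# Hypothetical values: one excitatory (positive) and one inhibitory (negative) weight.
example = neuron_output([0.5, 0.8], [1.2, -0.7])
print(example)
```

The positive weight pushes the output up and the negative weight pulls it down, which is the excitatory and inhibitory behavior described for Figure 1.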

1.1. Sources

Air pollution results from both natural and anthropogenic activities. Pollution from man-made activities such as fossil fuel combustion, construction, mining, agriculture, and warfare is the most significant and causes problems in the atmosphere [13].

Pollution sources fall into two categories: stationary and mobile. A stationary source is a fixed pollutant emitter, for example, fossil fuel burning power plants and refineries. A mobile source is a nonstationary pollutant emitter, for example, vehicles. The leading and fastest-growing cause of air pollution is the motor vehicle [14]. Pollutants that are emitted directly from the source into the air are known as primary pollutants, for example, carbon dioxide, carbon monoxide, and sulfur dioxide. Secondary pollutants are not emitted directly but form when primary pollutants react with each other in the atmosphere. For example, ozone forms when nitrogen oxides and hydrocarbons react in the presence of sunlight: nitrogen dioxide is split by sunlight, and the released oxygen atom combines with molecular oxygen to form ozone.

1.2. Health effects

Air pollution and its effects in rural and urban areas are directly related to the ongoing activities. For example, in cities, pollution is related to the products of combustion in industries and vehicles. Many large cities all over the world exhibit excessive levels of air pollutants. Among all dangerous pollutants, nitrogen dioxide (NO2) is important due to its capacity of causing dangerous effects on humans and the environment, which results in photochemical oxidation and acid rain.

The effects of air pollution cannot be ignored even within homes. Many air pollutants can cause cancer and other diseases among inhabitants. In 1985, it was reported that indoor toxic chemicals are three times more potent in causing cancer than outdoor air pollutants [15]. In America, health issues caused by buildings are called "sick building syndrome" [16].

1.3. Case study

In Pakistan, air pollution is emerging as a serious problem in the mega cities. It needs to be monitored and addressed at the root level in order to reduce the lethal impacts of pollutants on human and environmental health. The present study focuses on the twin cities of Rawalpindi and Islamabad. The two cities are 15 km apart and are commonly viewed as one unit. The study area with 135 sampling locations is shown in Figure 2. The climate of Rawalpindi and Islamabad is sub-humid to tropical, with long, hot summers (May to August) accompanied by a monsoon season (July to August), followed by short and mild winters (October to March). The average low temperature is 12.05 °C in January and the average high temperature is 31.13 °C in July.


Figure 2.

Base map

For the monitoring campaign, 135 sampling sites were chosen to cover as much of the urban area of Rawalpindi and Islamabad as possible and to represent different traffic intensities and congestion levels. These sites included dual carriageways, major, linking, and small roads, healthcare centers, educational institutes, commercial areas, old residential areas, modern residential areas, recreational spots, and semi-rural areas.

Research was carried out to monitor the NO2 concentration in the ambient air of Rawalpindi city. Passive samplers were used within the city from January to December 2008. The average concentration found was 27.46±0.32 ppb. The highest concentrations, which exceeded the limit value set by the World Health Organization, were recorded near the main roads and in the vicinity of schools and colleges due to the large number of transport vehicles.

2. Experimental design

2.1. Passive Sampling of NO2

The most frequently used method for passive sampling of NO2 in monitoring studies is the diffusion tube described by Atkins [17]. This method of NO2 measurement is reliable, easy to handle, and inexpensive for screening air quality. Moreover, passive samplers are well suited to extensive spatial measurement of NO2, and they have been used in NO2 monitoring studies in many countries, including the United Kingdom, USA, France, Turkey, Argentina, and China [18].

Passive samplers work on the principle of molecular diffusion: an efficient absorber sits at one end of the tube, and the sampling rate (flow rate) at constant temperature can be calculated from Fick's law [19]. This requires only the known length and diameter of the diffusion tube; sampling with diffusion tubes is independent of air pressure.
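
As a worked illustration of the Fick's law calculation, the sketch below converts the mass of NO2 collected in a tube into an average ambient concentration. The tube dimensions (7.1 cm long, 1.1 cm internal diameter), the diffusion coefficient (0.154 cm²/s), the 25 °C molar volume used for the ppb conversion, and the example mass and exposure time are illustrative assumptions, not values reported in this chapter.

```python
import math


def tube_concentration_ug_m3(mass_ug, hours, length_cm=7.1,
                             diameter_cm=1.1, diff_coeff_cm2_s=0.154):
    """Average concentration (ug/m3) from the mass collected by a diffusion tube."""
    area_cm2 = math.pi * (diameter_cm / 2) ** 2
    # Fick's first law: uptake rate = D * A / L, expressed in cm3 of air per second.
    sampling_rate_cm3_s = diff_coeff_cm2_s * area_cm2 / length_cm
    sampled_volume_cm3 = sampling_rate_cm3_s * hours * 3600
    # ug per cm3 converted to ug per m3 (1 m3 = 1e6 cm3).
    return mass_ug / sampled_volume_cm3 * 1e6


def no2_ug_m3_to_ppb(conc_ug_m3, molar_volume_l=24.45, molar_mass_g=46.0055):
    """Convert ug/m3 of NO2 to ppb, assuming 25 degC and 1 atm."""
    return conc_ug_m3 * molar_volume_l / molar_mass_g


# Hypothetical one-week exposure (168 h) that collected 1.0 ug of NO2.
weekly = tube_concentration_ug_m3(mass_ug=1.0, hours=168)
print(round(weekly, 1), "ug/m3 =", round(no2_ug_m3_to_ppb(weekly), 1), "ppb")
```

Because the sampling rate depends only on the diffusion coefficient and the tube geometry, no pump or flow meter is needed, which is what makes the method inexpensive.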

2.2. Neural network design

Data for the neural network analysis was collected from different sampling sites covering the whole study area. The collected data was fed to the neural network as a table with the columns area_id, season_id, temperature, humidity, rainfall, and the respective concentrations. The concentration column was marked as the value to predict, and the remaining columns were used as inputs to the neural network.

A neural network is used in two phases: training and testing. In the first phase (training), the network is trained to perform a particular task by providing complete information about the characteristics of the data and the observed outcomes.

In the training phase, a neural network develops a model that learns the relationship between the input data and the desired outcome. In the testing phase, testing data are provided as input. The performance in the testing phase depends on the training phase (on the number of samples provided during training and on the number of times the network is trained accurately). However, no network can be expected to give a 100% precise output for every input. MS Access was used as the database engine because it is easy to use.

For testing the neural network, cross-validation was performed with the holdout method, in which the data is divided into training and testing data. The database consisted of two tables: training_data and testing_data. The function of training_data is to train the ANN by adjusting its weights in order to maximize its predictive ability and minimize the forecasting error. Testing data was used to test the prediction accuracy of the ANN on new data. The structure of training data and testing data is given in Table 1.
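
A holdout split of this kind can be sketched as follows. The function name `holdout_split`, the 20% test fraction, the fixed seed, and the example records are assumptions made for illustration; the chapter does not state the actual split ratio.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=42):
    """Shuffle the records and return (training_data, testing_data)."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records using the input columns named in the text.
records = [
    {"area_id": i, "season_id": i % 4, "temperature": 20 + i,
     "humidity": 60, "rainfall": 10, "concentration": 40 + i}
    for i in range(10)
]
training_data, testing_data = holdout_split(records)
print(len(training_data), len(testing_data))
```

The network only ever adjusts its weights on `training_data`; `testing_data` is held back so that the reported accuracy reflects records the network has not seen.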

In Table 1, the first field, “id”, is the primary key and holds the row number. The second field, “loc_id”, holds the number of the location where the data was gathered, loc_name gives the name of the location, and the next six fields give the position of the location as north and east coordinates. The next two fields give the temperature and humidity levels.

The 13th and 14th fields give the concentration of NO2 and the level of the concentration value. The last field of the dataset contains the week number, which indicates the week in which the data was gathered from a particular location. The attributes of the testing data are the same as those of the training data structure.

Field Name | Data type | Primary key | Field size
Id | Number | Yes | Long Integer
loc_id | Number | | Long Integer
map_id | Number | | Long Integer
north_d | Number | | Long Integer
north_m | Number | | Long Integer
north_s | Number | | Long Integer
east_d | Number | | Long Integer
east_m | Number | | Long Integer
east-s | Number | | Long Integer
Concentration | Number | | Long Integer
con_level | Number | | Long Integer
Week | Number | | Long Integer

Table 1.

Structure of training data

To design a network, its architecture must be specified: the number of hidden layers and the number of units in each layer, along with the network properties that define the error function and the activation functions.

For optimal generalization from the collected data, two types of ANN architecture were used: the rtNEAT (real-time NeuroEvolution of Augmenting Topologies) architecture with an evolutionary algorithm, and the feed-forward architecture with the back propagation algorithm. Both were used to ensure high accuracy when the ANN predicts future NO2 concentrations. The rtNEAT architecture trains the neural network with an evolutionary algorithm that has three steps: selection, mutation, and reinsertion. Before training, however, the topology of the neural network has to be created. A neural network is a set of connected neurons with three types of nodes: input, output, and hidden. All nodes are created randomly during execution.
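
The three steps of the evolutionary algorithm (selection, mutation, and reinsertion) can be illustrated with a deliberately simplified sketch. It evolves fixed-length lists of numbers only; it is not rtNEAT, which also changes the network topology. The function `evolve`, the toy fitness function, and all parameter values are hypothetical.

```python
import random


def evolve(fitness, genome_length, population_size=20,
           generations=50, sigma=0.1, seed=0):
    """Minimal evolutionary loop: selection, mutation, reinsertion."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: rank by fitness and keep the fitter half as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        # Mutation: each child copies a random parent and adds Gaussian noise.
        children = [[gene + rng.gauss(0, sigma) for gene in rng.choice(parents)]
                    for _ in range(population_size - len(parents))]
        # Reinsertion: the children take the place of the discarded half.
        population = parents + children
    return max(population, key=fitness)


# Toy fitness (an assumption): genomes closer to a hidden target score higher.
hidden_target = [0.5, -0.25]


def toy_fitness(genome):
    return -sum((g - t) ** 2 for g, t in zip(genome, hidden_target))


best = evolve(toy_fitness, genome_length=2)
print(best)
```

In a neuroevolution setting the genome would encode connection weights (and, for rtNEAT, the nodes and connections themselves), and the fitness would be derived from the prediction error.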

Table 2 describes the properties of the network, namely the error function and the activation parameters. These properties apply to all networks tested by the architecture search method and to the manually selected network.

Parameter | Value
Input activation FX | Logistic
Output name | Concentration
Output error FX | Sum-of-squares
Output activation FX | Logistic

Table 2.

Network properties

The logistic function has a sigmoid curve. The sum of squares is the most frequently used error function and is the one applied here. The error is the sum of the squared differences between the target values and the values produced by the neural network.
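
The two network properties from Table 2 can be written out as a short sketch. The function names and the example values are illustrative assumptions. Because the logistic function only produces values between 0 and 1, concentrations would have to be rescaled to that range before training; how the authors' software does this is not stated.

```python
import math


def logistic(x):
    """Logistic (sigmoid) activation, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sum_of_squares(targets, outputs):
    """Sum of squared differences between target values and network outputs."""
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must have the same length")
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))


# Hypothetical scaled targets and network outputs.
error = sum_of_squares([0.2, 0.5, 0.9], [0.25, 0.4, 0.85])
print(logistic(0.0), error)
```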

2.3. Architecture search

A heuristic search is used to search for the best networks for the dataset. Heuristic methods speed up the process of finding a satisfactory solution. The architecture search for the NO2 neural network is given in Table 3.

ID | Architecture | # of Weights | Fitness | Train | AIC | Correlation | R-Squared

Table 3.

Heuristic architecture search for NO2

2.4. Training of neural network

The next step is to train the neural network on the NO2 dataset by using the quick propagation algorithm, which calculates the weight change by means of the quadratic function f(x) = x^2. A neural network consists of several layers of neurons: neurons in the input layer are connected to one or more neurons of the hidden layer, which are in turn connected to the neurons of the output layer. With each presentation of data to the network, the error is computed as the difference between the network output and the observed output. A combination of randomly assigned weights that gives a lower error replaces the initial weights. Training is this process of adjusting the connection weights so that the network produces the expected output. Two different weights with two different error values define two points of a secant. Relating this secant to a quadratic function makes it possible to calculate its minimum, where f'(x) = 0. The x-coordinate of the minimum is the new weight value.

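\[
\begin{aligned}
S(t) &= \frac{\partial E}{\partial w_i(t)} \\
\Delta w_i(t) &= -\alpha \, \frac{\partial E}{\partial w_i(t)} \quad \text{(normal back propagation)} \\
S(t) &= \frac{\partial E}{\partial w_i(t)} = -\frac{\Delta w_i(t)}{\alpha} \\
\Delta w_i(t) &= \frac{S(t)}{S(t-1) - S(t)} \, \Delta w_i(t-1) \quad \text{(quick propagation)}
\end{aligned}
\]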

Here w = weight, i = neuron, E = error function, t = time (training step), α = learning rate, and μ = maximal weight change factor, which limits how much larger a weight change may be than the previous one.

The quick propagation coefficient was set to 1.75, learning rate was 0.1, and iterations were 500. The training graph for dataset errors for NO2 is shown in Figure 3.
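
The quick propagation update for a single weight can be sketched as below, using the coefficient (1.75) and learning rate (0.1) quoted above. The function name `quickprop_step`, the gradient-descent fallback for the first step, the way μ is applied as a cap on the step size, and the toy error function E(w) = (w - 3)^2 are assumptions for illustration; the software used in the study may differ in these details.

```python
import math


def quickprop_step(slope, prev_slope, prev_step, learning_rate=0.1, mu=1.75):
    """Return the weight change for one weight at the current training step.

    slope and prev_slope are dE/dw at steps t and t-1; prev_step is the
    weight change that was applied at step t-1.
    """
    if prev_step == 0.0 or prev_slope == slope:
        # No usable secant yet: fall back to a plain gradient-descent step.
        return -learning_rate * slope
    # Jump to the minimum of the parabola fitted through the two slopes.
    step = slope / (prev_slope - slope) * prev_step
    # Cap the step at mu times the size of the previous step.
    max_step = mu * abs(prev_step)
    if abs(step) > max_step:
        step = math.copysign(max_step, step)
    return step


# Toy error function (an assumption): E(w) = (w - 3)^2, so dE/dw = 2 * (w - 3).
w, prev_slope, prev_step = 0.0, 0.0, 0.0
for _ in range(10):
    slope = 2.0 * (w - 3.0)
    step = quickprop_step(slope, prev_slope, prev_step)
    w += step
    prev_slope, prev_step = slope, step
print(w)
```

On an exactly quadratic error surface the secant step lands on the minimum as soon as the cap no longer binds, which is why the algorithm usually needs far fewer iterations than normal back propagation.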


Figure 3.

Dataset errors for the NO2 dataset

The training graph of correlation for NO2 is shown in Figure 4.


Figure 4.

Graph of correlation for NO2

The graph of error improvement (network errors) for NO2 is shown in Figure 5.


Figure 5.

Network errors for NO2

The error distribution of the network statistics obtained after training the neural network is shown in Figure 6.


Figure 6.

Error distribution for NO2

3. Data analysis

To determine seasonal variation and statistical significance, results are presented in tabular form. Tables 4a and 4b show the seasonal average concentration of NO2, along with standard deviation (SD) values, measured at the different sampling sites of the study.

Table 4a shows the average NO2 concentration in different seasons for 12 major sampling categories in urban Rawalpindi and Islamabad from November 2009 to July 2010.

Table 4b shows the seasonal average NO2 concentration for 12 major sampling categories in urban Rawalpindi and Islamabad from September 2010 to March 2011.

Table 5 presents the NO2 concentration for each selected category, as described in the study area profile, to show the general trends of NO2 concentration among the categories over the course of the experimental period.

(a)
Sampling Categories | Mild Winter | Winter (Dec to Jan) | Early Spring (Feb) | Spring (Mar) | Mild Summer (April) | Summer (Pre-Monsoon) (May to June) | Monsoon (July to August)
NO2 Conc., weekly basis (ppb)
Rawalpindi
Dual Carriage Ways (5) | 87±19.78 | 98±26.87 | 63±12.29 | 53±6.49 | 44±10.64 | 22±4.22 | 18±1.91
Major Roads (10) | 60±12.19 | 68±9.56 | 52±13.52 | 45±10.23 | 36±8.97 | 26±5.88 | 19±4.74
Sub-roads (6) | 74±20.50 | 86±24.47 | 60±16.49 | 50±11.05 | 38±12.65 | 33±13.01 | 21±4.39
Small Roads (3) | 55±9.78 | 63±4.89 | 47±5.57 | 40±8.24 | 31±3.40 | 25±4.68 | 18±4.81
Public Hospital (5) | 48±18.71 | 63±18.40 | 37±0.74 | 29±2.29 | 22±2.24 | 18±0.79 | 14±0.96
Private Hospitals (8) | 61±14.47 | 75±14.19 | 38±1.16 | 32±2.0.3 | 25±2.29 | 20±5.57 | 14±3.98
Public EI (11) | 85±30.58 | 95±32.94 | 75±23.75 | 63±17.94 | 47±17.37 | 31±10.14 | 20±1.94
Private EI (17) | 55±9.71 | 66±9.54 | 45±4.56 | 43±9.65 | 38±10.89 | 26±4.54 | 18±3.18
Old Residential Areas (5) | 83±15.24 | 95±16.09 | 55±13.32 | 51±6.66 | 37±6.44 | 26±2.54 | 19±1.05
Modern Residential Areas (5) | 65±20.07 | 73±14.89 | 69±24.49 | 59±12.55 | 36±7.13 | 28±5.08 | 21±2.61
Commercial Area (2) | 75±0.83 | 82±17 | 61±6.69 | 51±7.11 | 36±4.29 | 21±6.20 | 18±4.78
Bus Stops (9) | 74±20.26 | 83±31.47 | 69±33.78 | 58±17 | 39±17.32 | 28±8.41 | 20±5.25
Recreational Spots (9) | 75±38.40 | 87±40.76 | 62±36.39 | 56±21.88 | 43±19.97 | 31±11.12 | 19±2.37
Islamabad
Dual Carriage Ways (3) | 84±28.73 | 95±33.64 | 66±23.78 | 57±12.31 | 45±16.69 | 24±5.98 | 19±4.16
Major Roads (3) | 50±3.72 | 60±2.04 | 40±0.81 | 32±2 | 26±4.42 | 21±2.97 | 15±2.16
Sub-roads (4) | 54±6.06 | 67±6.39 | 49±6.49 | 43±7.24 | 38±12.79 | 25±1.38 | 18±1.26
Small Roads (3) | 59±12.65 | 64±6.33 | 51±9.60 | 44±8.93 | 35±4.53 | 26±3.66 | 20±3.08
Public Hospitals (3) | 44±0.58 | 57±0.29 | 39±0.29 | 32±0.58 | 23±1.47 | 19±0.51 | 15±1.71
Private Hospitals (1) | 42 | 56 | 38 | 30 | 24 | 19 | 14
Public EI (5) | 53±13.34 | 64±9.32 | 46±9.30 | 39±10.76 | 34±14.19 | 25±6.01 | 18±1.28
Private EI (6) | 58±11.23 | 63±7.18 | 49±7.72 | 39±9.93 | 31±5.85 | 24±2.66 | 17±1.77
Commercial Area (1) | 61 | 68 | 57 | 50 | 35 | 25 | 16
Bus Stops (12) | 72±14.25 | 78±16.23 | 65±7.51 | 55±5.23 | 34±6.22 | 25±3.21 | 19±2.56
Recreational Spots (2) | 62±5.97 | 69±4.58 | 57±2.45 | 48±1.59 | 38±2.48 | 25±3.15 | 17±1.56
Semi-Rural Areas (7) | 46±8.98 | 59±5.64 | 42±6.41 | 33±7.87 | 31±5.29 | 24±3.22 | 18±3.19

(b)
Sampling Categories | NO2 Conc., weekly basis (ppb)
Rawalpindi
Dual Carriage Ways (5) | 30±4.00 | 51±7.52 | 88±22.28 | 100±26.42 | 63±18.52 | 50±20.32
Major Roads (10) | 27±6.57 | 49±9.72 | 61±10.33 | 68±9.34 | 48±3.76 | 37±3.79
Sub-roads (6) | 32±7.31 | 53±12.97 | 74±23.42 | 87±26.60 | 58±13.69 | 39±6.36
Small Roads (3) | 28±3.65 | 37±2.53 | 53±6.30 | 62±2.58 | 54±12.39 | 43±11.13
Public Hospital (5) | 20±0.98 | 32±5.46 | 48±18.01 | 64±18.03 | 40±6.13 | 29±3.94
Private Hospitals (8) | 23±3.90 | 38±7.19 | 60±15.37 | 73±14.03 | 40±4.05 | 31±1.95
Public EI (11) | 44±16.98 | 81±36.87 | 86±31.78 | 96±34.20 | 73±21.26 | 63±18.23
Private EI (17) | 31±4.85 | 42±6.10 | 55±8.91 | 66±9.82 | 45±7.08 | 35±5.90
Islamabad
Dual Carriage Ways (3) | 31±5.72 | 50±11.40 | 82±21.11 | 99±32.70 | 67±19.78 | 49±12.84
Major Roads (3) | 22±0.80 | 37±1.93 | 53±4.33 | 65±0.30 | 44±2.30 | 33±2.00
Sub-Roads (4) | 26±2.48 | 39±4.60 | 54±5.74 | 65±4.08 | 46±3.02 | 35±9.17
Small Roads (3) | 30±5.94 | 41±4.12 | 54±4.24 | 63±6.67 | 47±2.60 | 38±2.79
Public Hospitals (3) | 22±2.14 | 34±2.66 | 45±0.80 | 60±1.41 | 45±0.22 | 34±0.82
Private Hospitals (1) | 22 | 31 | 40 | 55 | 38 | 30
Public EI (5) | 31±7.41 | 40±3.60 | 52±7.94 | 64±8.39 | 46±9.34 | 37±9.62
Private EI (6) | 29±11.58 | 41±8.65 | 54±10.14 | 63±7.03 | 47±7.93 | 36±9.17
Twin Cities
Old Residential Areas (5) | 27±2.97 | 61±14.74 | 84±14.18 | 95±16.51 | 58±12.41 | 48±10.06
Modern Residential Areas (5) | 32±7.86 | 49±11.70 | 66±20.07 | 75±16.16 | 60±19.16 | 48±16.53
Commercial Area (3) | 32±1.23 | 46±6.09 | 63±1.00 | 71±3.57 | 56±7.02 | 48±8.41
Bus Stops (11) | 32±9.11 | 53±20.30 | 76±20.07 | 87±32.40 | 69±31.34 | 54±19.54
Recreational Spots (10) | 37±18.55 | 52±25.23 | 71±37.63 | 84±39.83 | 57±29.71 | 46±24.78
Semi-Rural Areas (7) | 31±9.47 | 41±7.44 | 53±6.51 | 62±6.21 | 44±7.50 | 36±6.99

Table 4.

(a): Seasonal mean values of NO2 from November 2009 to July 2010 (b): Seasonal mean values of NO2 from September 2010 to March 2011

Sampling Categories | No. of Sites | Average NO2 Conc. (ppb)
Dual Carriage Ways | 8 | 55.23
Major Roads | 13 | 53.56
Sub-roads | 10 | 51.78
Bus Stops | 11 | 51.62
Educational Institutions | 39 | 51.26
Recreational Spots | 10 | 51.18
Old Residential Area | 5 | 48.97
Small Roads | 6 | 48.23
Commercial Area | 3 | 47.59
Hospitals | 17 | 47.44
Modern Residential Area | 5 | 46.25
Semi-Rural Area | 7 | 37.65

Table 5.

Average NO2 concentration levels in twin cities from November 2009 to March 2011

In Table 5, most of the sampling sites of the study area showed nearly similar average concentrations from November 2009 to March 2011. The maximum NO2 concentration (55.23 ppb) was found on dual carriage ways.

The possible causes of such elevated NO2 concentrations are the extensive increase in the number of vehicles, population growth, busy roads, fuel-inefficient vehicles, driving habits, and traffic jams. Gilbert reported that NO2 is considerably related to both the distance from the nearest highway and the traffic count on the nearest highway [20].

The rest of the categories showed nearly the same average concentration. Major roads and sub-roads showed average NO2 concentration levels of 53.56 ppb and 51.78 ppb, respectively. Sub-roads, bus stops, recreational spots, and educational institutions showed similar concentration levels of approx. 51 ppb.

Educational institutions and recreational spots, being close to the dual carriage ways, also experience elevated concentration levels. Old residential areas (48.97 ppb) showed slightly higher NO2 concentration levels than modern residential areas (46.25 ppb).

In the old residential areas, narrow roads, enclosing architecture, and congestion trap traffic emissions, which build up and lead to higher NO2 concentration levels, whereas in modern residential areas the increased number of vehicles is the major cause of elevated NO2 levels. The minimum NO2 concentration level, 37.65 ppb, was found in semi-rural areas. A study in Vilnius reported the same phenomenon: average NO2 levels depend on traffic and are highest at crossroads and lowest in background suburban areas [21].

A spatial interpolation map of the annual average concentration of nitrogen dioxide was developed by using inverse distance weighted (IDW) interpolation. The IDW map in Figure 7 clearly depicts the areas of higher and lower NO2 concentration in Rawalpindi and Islamabad.
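
The principle behind IDW interpolation is that the estimate at an unsampled point is a weighted average of the measured values, with weights that decrease with distance. The sketch below illustrates this with a power of 2 and flat x/y coordinates; the function name `idw`, the power parameter, and the three sample points are hypothetical, and GIS software typically adds options (search radius, number of neighbors) that are omitted here.

```python
import math


def idw(x, y, samples, power=2):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (sample_x, sample_y, value) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The query point coincides with a sampling site.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical sampling sites: (x in km, y in km, NO2 in ppb).
sites = [(0.0, 0.0, 55.0), (4.0, 0.0, 47.0), (0.0, 3.0, 38.0)]
print(idw(1.0, 1.0, sites))
```

Evaluating `idw` on a regular grid of points and shading each cell by its value gives a continuous surface of the kind shown in Figure 7.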

Higher concentration levels are represented by darker shades while the lower concentration levels are shown with lighter shades. The maximum NO2 values were found at the center of the city, where they reached the concentration of 83–110 ppb. Values were low on the outskirts of the city, with the lowest concentration in north (31–44 ppb).

Dual carriage ways, sub-roads, major roads, commercial areas, old residential areas, and areas where schools and colleges are located have higher concentration levels of NO2. Intense traffic flow and congestion were the major reasons for these elevated levels of nitrogen dioxide, as vehicular emission is the predominant source of NO2.

The vehicle growth rate in the twin cities is very high. The traffic load is continuously increasing with the growing population and the growing demand for motor vehicles. As a result, traffic congestion is also increasing day by day, resulting in higher emission rates per vehicle.

The higher emission rate of NO2 can also be attributed to the type and quality of fuel [22]. In Figure 7, Rawalpindi showed higher concentration levels than Islamabad due to its building patterns.


Figure 7.

Spatial distribution of NO2 concentration

3.1. Neural network data analysis

Based on the neural network design, architecture, and properties discussed above, the data space was searched by using the heuristic search method with 500 iterations, with the fitness criterion set to inverse test error. The top five networks found by the heuristic search are shown in Figure 8.

Heuristic search is a problem-solving method that analytically searches a space of problem states. The best network is the one whose absolute error reaches its minimum in the initial iterations; the best of the top five networks is shown in Figure 9.

Results were produced for all datasets after training and testing. The real vs. target graph is a line graph of the real and network-predicted target values for the records displayed in Table 6. The x-axis shows the selected input column values and the y-axis shows the network-predicted output values. Table 6 presents the summary of the real vs. output table after training.


Figure 8.

The top five networks explored by heuristic search approach


Figure 9.

Network explored by heuristic search

Target Output AE ARE

Table 6.

Summary of real vs. target

Note: Correlation, 0.653989; R-squared, -0.290243
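
The summary statistics of Table 6 can be computed as sketched below. The definitions are assumptions, since the chapter does not give them: AE is taken as the absolute error, ARE as the absolute error divided by the target, and R-squared as 1 - SS_res/SS_tot. Under that last definition R-squared can be negative even when the correlation is positive, for example when the outputs follow the trend of the targets but are systematically offset.

```python
import math


def pearson_r(targets, outputs):
    """Pearson correlation between target values and network outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)


def r_squared(targets, outputs):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - o) ** 2 for t, o in zip(targets, outputs))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


def absolute_errors(targets, outputs):
    return [abs(t - o) for t, o in zip(targets, outputs)]


def absolute_relative_errors(targets, outputs):
    # Assumes no target value is zero.
    return [abs(t - o) / abs(t) for t, o in zip(targets, outputs)]


# Hypothetical data: outputs follow the trend exactly but are offset by +2.
targets = [1.0, 2.0, 3.0, 4.0]
outputs = [3.0, 4.0, 5.0, 6.0]
print(pearson_r(targets, outputs), r_squared(targets, outputs))
```

In this toy case the correlation is 1 while R-squared is negative, which shows that the two figures in the note measure different things.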

The visualization of real vs. output values, with row number on the x-axis and target/output (area_id) on the y-axis, is shown in Figure 10.


Figure 10.

Real vs. network output

Figure 11 shows a scatter plot of the real and forecasted output values. The x-axis presents the real values and the y-axis shows the predicted network values.

The graph in Figure 12 shows the dependence of the network error on the values entered in the input columns of the data sheet. The error dependence graph makes it possible to identify the ranges of the selected input column that produce network error.

The last phase, after the neural network is trained and tested, is to query the network. The concentration is the output value of the neural network, so the query inputs are area_id, season_id, temperature, relative humidity, and rainfall (Figure 13).


Figure 11.

Scattered plot of real and output network values


Figure 12.

Graph of error dependence

The input Excel sheets are prepared for the GIS mapping. Sheets include area_id, their latitude, longitude, and their concentrations. With the help of interpolation, maps are created for the service.


Figure 13.

Excel sheet presenting manual query

Temporal variation can be explained through meteorological recorded conditions. However, most of the variations on a local scale are due to the impact of air pollutants.


Figure 14.

Relationship of rainfall, temperature, and humidity with NO2 concentration (November 2009–March 2011)

Figure 14 indicates a positive association of NO2 concentration with humidity (RH in %) and a negative association with temperature. Figure 15 shows the concentration of NO2 during summer, when the recorded temperature, humidity, and rainfall were 31 °C, 67%, and 17 mm, respectively.

Figure 16 shows the concentration of NO2 during the winter season at 11 °C, 68% humidity, and 9 mm rainfall.


Figure 15.

NO2 concentration in summer


Figure 16.

NO2 concentration in winter

Figure 17 shows the concentration of NO2 during the spring season, when the recorded temperature was 35 °C, humidity 58%, and rainfall 60 mm.


Figure 17.

NO2 concentration in spring

Figure 18 shows the predicted concentration of NO2 in the autumn season, when the recorded temperature, humidity, and rainfall were 29 °C, 69%, and 22 mm, respectively.


Figure 18.

NO2 concentration in autumn


Figure 19.

Seasonal variation in NO2 concentration levels (November 2009 – March 2011)

Figure 19 shows that the concentration of NO2 varies between seasons. The minimum NO2 values were recorded from May to August, and the maximum concentration was measured in the winter season from December to January.

4. Conclusion

NO2 concentration levels were recorded on an hourly and weekly basis in Rawalpindi and Islamabad by using diffusion tubes. Artificial neural networks were trained to generalize the process of air pollutant spread over three dimensions. The prediction capabilities of the ANN were analyzed through generalization by using the hold-out evaluation method. Results showed the advantage of using an rtNEAT-like ANN architecture, in which a neural network can modify its architecture to reduce the error as far as possible. Results showed that the annual average concentration of NO2 was 44 ± 6 ppb. However, the highest concentration was recorded in the winter season near the dual carriage ways, schools, and colleges because of the higher number of transport vehicles on the road. This is consistent with reduced photolysis, caused by weaker solar radiation, leading to the accumulation of NO2 during winter. It is further supported by the correlation results, which reveal a negative correlation of nitrogen dioxide concentration with rainfall and temperature and a positive correlation with humidity. Moreover, the results reveal that the measured NO2 concentration levels at different sampling areas exceeded the limit values set by the World Health Organization and the Pak-EPA standard policy. This type of investigative study of artificial neural networks in air pollution modeling shows promising applications for advanced machine learning algorithms in the emerging research area called eco-informatics.


1 - Mulaku C. Mapping and analysis of air pollution in Nairobi, Kenya. International conference on spatial information for sustainable development, Kenya, 2001.
2 - Gualtieri G and Tartaglia M. Predicting urban traffic air pollution: a GIS framework. Transportation Research – D; 1998; 3(5): 329–336.
3 - Pummakarnchana O, Tripathi N and Dutta J. Air pollution monitoring and GIS modeling: a new use of nanotechnology based solid state gas sensors. Science and Technology of Advanced Materials 2005; 6: 251–255.
4 - Afshar H and Delavar MR. GIS-based air pollution modeling in Tehran. Environmental Informatics 2007; 5: 557–566.
5 - Barnes J, Parsons B and Salter L. GIS Mapping of nitrogen dioxide diffusion tube monitoring in Cornwall, UK. Air Pollution 2005; 13: 157–166.
6 - Elbir T, Mangir N, Kara M, Simsir S, Eren T and Ozdemir S. Development of a GIS-based decision support system for urban air quality management in city of Istanbul. Atmospheric Environment 2010; 44: 441–454.
7 - Veen AVD, Briggs DJ, Collins S, Elliott S, Fischer P, Kingham S, Lebret E, Pryl K, Reeuwijk HV and Smallbone K. Mapping urban air pollution using GIS: a regression-based approach. International Journal of Geographical Information Science 2010; 11(7): 699–718.
8 - Vienneau D, de Hoogh K and Briggs D. A GIS-based method for modeling air pollution exposures across Europe. Science of the Total Environment 2009; 408: 255–266.
9 - Banja M, Como E, Murtaj B and Zotaj A. Mapping air pollution in urban Tirana area using GIS. International Conference SDI, Skopje, 15–17 September 2010, 105–114.
10 - Jensen SS. Mapping human exposure to traffic air pollution using GIS. Journal of Hazardous Materials 1998; 61(3): 385–392.
11 - Kim JJ, Smorodinsky S, Lipsett M, Singer BC, Hodgson AT and Ostro B. Traffic-related air pollution near busy roads. American Journal of Respiratory and Critical Care Medicine 2004; 170(5): 520–526.
12 - Alexander SM. Data mining 2005. hhp//, FOR/course.mat/Alex/ (Accessed 12th February 2008).
13 - United Nations. Conference on the Human Environment, Sweden, 1972.
14 - United Nations. Environmental Performance Annual Report, New York, 2001.
15 - US-EPA. Health affects of different air quality index (AQI) levels caused by nitrogen dioxide, 2008.
16 - Miller D. Potential hazards of future volcanic eruptions. California, 1989.
17 - Atkins DHF, Sandallas J, Law DV, Hough AM and Stevenson K. The measurement of nitrogen dioxide in the outdoor environment using passive diffusion tube samplers. AEA Technology, 1986.
18 - Varshney CK and Singh AP. Passive samplers for NOx monitoring: a critical review. The Environmentalist 2003; 23: 127–136.
19 - Palmes ED, Gunnison AF, Dimattio J, and Tomczyk C. Personal sampler for nitrogen dioxide. American Industrial Hygiene Association Journal 1976; 37: 570–577.
20 - Gilbert NL, Goldberg MS, Beckerman B, Brook JR and Jerrett M. Assessing spatial variability of ambient nitrogen dioxide in Montreal, Canada, with a land-use regression model. Journal of the Air & Waste Management Association 2005; 65: 1059–1063.
21 - Lozano A, Usero J, Vanderlinden E, Raez J, Contreras J, Navarrete B and Bakouri HEI. Air quality monitoring network design to control nitrogen dioxide and ozone applied in Granada Spain. Ozone: Science & Engineering 2011; 33(1): 80–89.
22 - Heywood JB. Internal combustion engine fundamentals. McGraw-Hill, New York, 1998.