Open access

Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia

Written By

Marzuki Ismail, Azrin Suroto and Nurul Ain Ismail

Submitted: 01 December 2011 Published: 22 August 2012

DOI: 10.5772/50033

From the Edited Volume

Air Pollution - A Comprehensive Perspective

Edited by Budi Haryanto

1. Introduction

Tropospheric ozone is an air pollutant that arises from photochemical reactions among natural and anthropogenic precursors, namely volatile organic compounds (VOCs) and nitrogen oxides (NOx). Ozone can accumulate to high levels under favorable meteorological conditions and has adverse effects on human health and ecosystems [1]. Chan and Chan [2] concluded that Asia is not spared the adverse impact of ozone pollution, as elevated ozone levels have been detected there. Nevertheless, long-term ozone trends have been little studied, especially in Malaysia.

Time series analysis is one of the best tools for understanding cause-and-effect relationships in environmental pollution [3, 4, 5]. It has been applied in many studies to describe the past movement of a particular variable over time, and researchers have used several different techniques to determine how air pollution behavior changes over time [6, 7]. Hsu [8] used vector autoregression (VAR) to establish the interdependence between primary and secondary air pollutants in the Taipei area. Omidravi et al. [9] applied the Fast Fourier Transform to investigate extremely high ozone concentrations in each season in Isfahan. This study aims to determine the qualitative and quantitative aspects of tropospheric ozone concentrations so that future concentrations of this anthropogenic air pollutant can be predicted in the study area, i.e. Kemaman, Malaysia.

2. Material and method

This study was conducted in Kemaman (04°12'N, 103°18'E), a developing Malaysian town located between the industrializing Kertih Petrochemical Industrial Area to the north and the industrializing and urbanizing Gebeng Industrial Area to the south (Figure 1). The dominant sources of ozone precursors in this area are industrial activities and road traffic.

Figure 1.

Locations of air monitoring station in Kemaman

In this study, the ozone trend was examined using 144 monthly observations from January 1996 to December 2007, acquired from the Air Quality Division of ASMA for the Sekolah Rendah Bukit Kuang station in Kemaman district, one of the earliest operational stations in Malaysia. The monitoring network was installed and is operated and maintained by Alam Sekitar Malaysia Sdn. Bhd. (ASMA) under concession from the Department of Environment Malaysia [10]. Tropospheric ozone concentrations were recorded with a Teledyne Technologies Incorporated Model 400E analyzer, which applies the Beer-Lambert law to measure low ranges of ozone in ambient air. A 254 nm UV light signal is passed through the sample cell, where it is absorbed in proportion to the amount of ozone present. Every three seconds, a switching valve alternates the measurement between the sample stream and a sample that has been scrubbed of ozone. Comparing the two readings yields a true, stable ozone measurement [11].
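The measurement principle described above can be written out as a short worked relation. This is only a sketch of the textbook Beer-Lambert law, not the analyzer's documented algorithm: the symbols $I$ (UV intensity through the ambient sample), $I_0$ (intensity through the ozone-scrubbed reference), $\alpha$ (absorption coefficient of ozone at 254 nm), $L$ (optical path length of the cell) and $C$ (ozone concentration) are introduced here for illustration, and any temperature or pressure corrections applied by the instrument are omitted.

```latex
\frac{I}{I_0} = e^{-\alpha L C}
\quad\Longrightarrow\quad
C = -\frac{1}{\alpha L}\,\ln\!\left(\frac{I}{I_0}\right)
```

Because the same lamp and cell are used for both readings, alternating between the sample and the scrubbed reference lets slow drifts in lamp output largely cancel in the ratio $I/I_0$.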

Time series analysis was carried out with the STATGRAPHICS® statistical software package. A time series is a set of sequential numeric data taken at equally spaced intervals, usually over a period of time or space. This study applies two time-domain methods: seasonal decomposition and trend analysis.

Seasonal decomposition was used to split the ozone series into a seasonal component, a combined trend and cycle component, and a short-term variation component, i.e.

$$O_t = T_t \times S_t \times I_t$$

where $O_t$ is the original ozone time series, $T_t$ is the long-term trend component, $S_t$ is the seasonal variation, and $I_t$ is the short-term variation (error) component. Because the seasonality increases with the level of the series, a multiplicative model was used to estimate the seasonal index. Under this model, the trend has the same units as the original series, while the seasonal and irregular components are unitless factors distributed around 1, so the magnitude of the seasonal fluctuations varies as the underlying level of the series changes. The seasonal index is the average deviation of each month's ozone value from the ozone level attributable to the other components in that month.
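The multiplicative decomposition above can be illustrated with a small sketch. It uses the ratio-to-moving-average approach, which is one common way to estimate seasonal factors and is assumed here for illustration; the study's software may differ in detail. The function names and the synthetic series `demo` are hypothetical and are not the Kemaman data.

```python
import math

def centered_moving_average(series, period=12):
    """Centered 2 x period moving average; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        # The two end points get half weight so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend

def seasonal_indices(series, period=12):
    """Average ratio of each observation to the trend, per position in the cycle."""
    trend = centered_moving_average(series, period)
    ratios = [[] for _ in range(period)]
    for t, (obs, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            ratios[t % period].append(obs / tr)
    raw = [sum(r) / len(r) for r in ratios]
    scale = period / sum(raw)  # normalise so the factors average to 1
    return [scale * r for r in raw]

# Hypothetical series: a linear trend multiplied by a known seasonal pattern.
true_pattern = [1.0 + 0.2 * math.sin(2 * math.pi * m / 12) for m in range(12)]
demo = [(20.0 + 0.05 * t) * true_pattern[t % 12] for t in range(144)]

indices = seasonal_indices(demo)
for month, factor in enumerate(indices, start=1):
    print(f"month {month:2d}: seasonal factor {factor:.3f}")
```

Multiplying the factors by 100 gives indices on the percentage scale, where 100 represents the average level.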

In trend analysis, the Box-Jenkins Autoregressive Integrated Moving Average (ARIMA) model was applied to model the behavior of the time series and generate the forecast trend. The methodology is a four-step iterative procedure. The first step is model identification, in which the historical data are used to tentatively identify an appropriate Box-Jenkins model. The second step is estimation of the parameters of the tentatively identified model. The third step is diagnostic checking, which tests the adequacy of the identified model; if the model is inadequate, a better one must be identified. Finally, the best model is used to forecast the time series.

In model identification (step 1), the data were examined to find the most appropriate class of ARIMA processes by selecting the order of regular and seasonal differencing required to make the series stationary, and by specifying the order of the regular and seasonal autoregressive and moving average polynomials needed to represent the time series adequately. The Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF) are the most important tools for this step. The ACF measures the amount of linear dependence between observations in a time series that are separated by a lag k. The PACF plot helps to determine how many autoregressive terms are necessary. Together, the plots reveal one or more of the following characteristics: time lags at which high correlations appear, seasonality of the series, and trend in either the mean level or the variance of the series. The general model introduced by Box and Jenkins includes autoregressive and moving average parameters as well as differencing in its formulation.
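To make the role of the ACF concrete, the following sketch computes sample autocorrelations and flags lags that exceed the usual approximate 95% bound of 1.96 divided by the square root of n. The series `demo` is synthetic (a 12-period sine wave plus noise) and stands in for a seasonal series; it is not the ozone record, and the function name is hypothetical. The PACF is not computed here.

```python
import math
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]

# Hypothetical seasonal series with a 12-step cycle.
rng = random.Random(0)
demo = [math.sin(2 * math.pi * t / 12) + rng.gauss(0, 0.3) for t in range(144)]

acf = sample_acf(demo, 24)
bound = 1.96 / math.sqrt(len(demo))  # approximate 95% limit for white noise
significant_lags = [k for k, r in enumerate(acf, start=1) if abs(r) > bound]
print("lags outside the approximate 95% bound:", significant_lags)
```

A large positive autocorrelation at lag 12, as in this synthetic example, is the kind of pattern that points to seasonal terms or seasonal differencing.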

The model has three types of parameters: the autoregressive parameters (p), the number of differencing passes (d) and the moving average parameters (q). Box-Jenkins models are summarized as ARIMA (p, d, q). For example, ARIMA (1,1,1) denotes a model with 1 autoregressive (p) parameter and 1 moving average (q) parameter, fitted to the time series after it has been differenced once to attain stationarity. In addition to the non-seasonal ARIMA (p, d, q) model introduced above, seasonal ARIMA (P, D, Q) parameters can be identified for the data: seasonal autoregressive (P), seasonal differencing (D) and seasonal moving average (Q). Seasonality is defined as a pattern that repeats itself over a fixed interval of time. In general, seasonality can be detected as a large autocorrelation or partial autocorrelation coefficient at a seasonal lag. For example, ARIMA (1,1,1)(1,1,1)12 describes a model with 1 autoregressive parameter, 1 moving average parameter, 1 seasonal autoregressive parameter and 1 seasonal moving average parameter, computed after the series has been differenced once at lag 1 and once at lag 12.
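The differencing at lag 1 and lag 12 mentioned above can be sketched in a few lines. The series `demo` is hypothetical (a linear trend plus a fixed 12-month pattern), chosen so that the effect of each differencing pass is easy to see; the helper name `difference` is also an assumption.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical series: linear trend plus a repeating 12-month pattern.
pattern = [3, 1, 0, -1, 2, 3, 5, 6, 5, 1, -2, 2]
demo = [20 + 0.5 * t + pattern[t % 12] for t in range(144)]

seasonal_diff = difference(demo, lag=12)    # removes the repeating pattern (D = 1)
stationary = difference(seasonal_diff, 1)   # removes the remaining trend (d = 1)

print(len(demo), len(seasonal_diff), len(stationary))
```

Each pass shortens the series by its lag, so one seasonal and one regular difference of a 144-point monthly series leave 131 values.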

The general form of the above model, which describes the current value $Z_t$ of a time series in terms of its own past, is:

$$(1-\phi_1 B)(1-\alpha_1 B^{12})(1-B)(1-B^{12})\,Z_t = (1-\theta_1 B)(1-\gamma_1 B^{12})\,\varepsilon_t \qquad (1)$$

where:

$1-\phi_1 B$ = non-seasonal autoregressive term of order 1

$1-\alpha_1 B^{12}$ = seasonal autoregressive term of order 1

$Z_t$ = the current value of the time series examined

$B$ = the backward shift operator, with $BZ_t = Z_{t-1}$ and $B^{12}Z_t = Z_{t-12}$

$1-B$ = first-order non-seasonal difference

$1-B^{12}$ = seasonal difference of order 1

$1-\theta_1 B$ = non-seasonal moving average term of order 1

$1-\gamma_1 B^{12}$ = seasonal moving average term of order 1

$\varepsilon_t$ = the random error (white noise) term
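As a worked illustration of how the backward shift operator acts in the model above, the two differencing factors and the two moving average factors can be multiplied out using only the rule $B^k Z_t = Z_{t-k}$:

```latex
\begin{aligned}
(1-B)(1-B^{12})\,Z_t &= Z_t - Z_{t-1} - Z_{t-12} + Z_{t-13},\\
(1-\theta_1 B)(1-\gamma_1 B^{12})\,\varepsilon_t
  &= \varepsilon_t - \theta_1\,\varepsilon_{t-1} - \gamma_1\,\varepsilon_{t-12}
     + \theta_1\gamma_1\,\varepsilon_{t-13}.
\end{aligned}
```

The cross terms at lag 13 arise from the product of the regular and seasonal factors, which is why a multiplicative seasonal model involves lags beyond 1 and 12 even though it has only one parameter of each type.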

For the seasonal model, the Akaike Information Criterion (AIC) was used for model selection. The AIC combines two conflicting factors: the mean square error and the number of estimated parameters of a model. Generally, the model with the smallest AIC value is chosen as the best model [12].
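The trade-off between fit and model size can be sketched as follows. The chapter does not state the AIC formula used, so the form below, AIC = 2 ln(RMSE) + 2c/n with c estimated parameters and n fitted observations, is an assumption made for illustration. With the values reported later in the chapter for the selected model (RMSE 0.00258 in Table 3, four estimated parameters in Table 2, 132 fitting observations), this form gives roughly -11.86, which is close to the reported AIC of -11.8601.

```python
import math

def aic(rmse, n_params, n_obs):
    """Assumed AIC form: fit term 2*ln(RMSE) plus penalty 2*c/n."""
    return 2.0 * math.log(rmse) + 2.0 * n_params / n_obs

# Values taken from Tables 2 and 3; the formula itself is an assumption.
selected = aic(rmse=0.00258, n_params=4, n_obs=132)

# Same fit with one extra parameter: the penalty term raises the AIC.
larger_model = aic(rmse=0.00258, n_params=5, n_obs=132)

print(f"AIC, 4 parameters: {selected:.4f}")
print(f"AIC, 5 parameters: {larger_model:.4f}")
```

Under this form an additional parameter is only worthwhile if it lowers the RMSE enough to offset the 2/n penalty.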

Once the most appropriate model has been chosen, its parameters are estimated (step 2). The ACF and PACF plots of the stationary data are examined to identify which autoregressive or moving average terms are suggested. The parameter values are then chosen by the least squares method, which makes the Sum of the Squared Residuals (SSR) between the real data and the estimated values as small as possible. In most cases, a nonlinear estimation method is used to estimate the identified parameters so as to maximize the likelihood (probability) of the observed series given the parameter values [13].

In the diagnostic checking step (step 3), the residuals from the fitted model are examined to assess its adequacy. This is usually done by correlation analysis of the residual ACF plots and by a goodness-of-fit test based on the Chi-square statistic (χ2). If the residuals are correlated, the model should be refined as in step 1. Otherwise, the residuals are white noise and the model is adequate to represent the time series.
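One common form of the Chi-square check on residual autocorrelations is the Box-Pierce statistic, Q = n times the sum of squared residual autocorrelations, compared with a chi-square critical value. Whether the study's software used exactly this statistic is an assumption; the sketch below only illustrates the idea on synthetic residuals (`white` and `correlated` are hypothetical, not residuals of the ozone model).

```python
import random

def residual_acf(residuals, max_lag):
    """Sample autocorrelations of the residuals at lags 1 .. max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]

def box_pierce(residuals, max_lag):
    """Box-Pierce portmanteau statistic Q = n * sum of squared autocorrelations."""
    n = len(residuals)
    return n * sum(r * r for r in residual_acf(residuals, max_lag))

# Upper 5% point of the chi-square distribution with 20 degrees of freedom.
# For residuals of a fitted ARIMA model, the degrees of freedom are usually
# reduced by the number of estimated parameters.
CHI2_95_DF20 = 31.41

rng = random.Random(1)
white = [rng.gauss(0, 1) for _ in range(200)]
correlated = [0.0]
for _ in range(199):
    correlated.append(0.8 * correlated[-1] + rng.gauss(0, 1))

q_white = box_pierce(white, 20)
q_correlated = box_pierce(correlated, 20)
print(f"Q (uncorrelated residuals):   {q_white:.1f}")
print(f"Q (autocorrelated residuals): {q_correlated:.1f}")
print("refine model:", q_correlated > CHI2_95_DF20)
```

A value of Q above the critical value indicates that the residuals are still correlated, so the model should go back to the identification step.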

The final stage of the modeling process (step 4) is forecasting, which yields three outputs: the forecast values and the upper and lower limits of a 95% confidence interval. Any forecast value within the confidence limits is satisfactory. Finally, the accuracy of the model is checked with the mean square error (MS), which is used to compare the fits of different ARIMA models. A lower MS value corresponds to a better-fitting model.

3. Results and discussion

The first step in time series analysis is to draw a time series plot, which provides a preliminary understanding of the behavior of the series over time (Figure 2). The trend of the original series appears to be slightly increasing. Nonetheless, this needs to be tested and confirmed through descriptive analysis and trend modeling.

Figure 2.

Original monthly ozone concentration for Kemaman

The ozone series shows a well-defined annual cycle, with the highest ozone means occurring in August and the lowest in November (Figure 3). Table 1 shows that the seasonal indices range from a low of 80.047 in November to a high of 122.058 in August. This indicates a seasonal swing from 80.047% to 122.058% of the average over the course of one complete cycle, i.e. one year. The seasonal variation pattern in Kemaman differed from that of other countries, such as the United States, United Kingdom, Italy, Canada, and Japan, in that the peak ozone concentration did not correspond to maximum photochemical activity in summer [14,15,16].
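The seasonal indices reported in Table 1 can be checked and applied with a few lines of code. The index values below are copied from Table 1; the observed value passed to `deseasonalize` is purely hypothetical, and the helper name is an assumption.

```python
# Seasonal indices from Table 1 (percent of the average level).
seasonal_index = {
    "January": 107.199, "February": 90.8259, "March": 84.7179,
    "April": 80.7204, "May": 101.135, "June": 105.618,
    "July": 115.073, "August": 122.058, "September": 117.771,
    "October": 93.0941, "November": 80.0473, "December": 101.741,
}

mean_index = sum(seasonal_index.values()) / len(seasonal_index)
peak = max(seasonal_index, key=seasonal_index.get)
trough = min(seasonal_index, key=seasonal_index.get)
swing = seasonal_index[peak] - seasonal_index[trough]

def deseasonalize(value, month):
    """Divide out the multiplicative seasonal factor for the given month."""
    return value / (seasonal_index[month] / 100.0)

print(f"mean index: {mean_index:.3f}")
print(f"peak: {peak}, trough: {trough}, swing: {swing:.3f} percentage points")
# Hypothetical August reading, in the same units as the original series.
print(f"seasonally adjusted value: {deseasonalize(0.024, 'August'):.5f}")
```

The indices average to about 100, as expected for a multiplicative model in which the seasonal factors are distributed around 1.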

To forecast the trend, the first 132 observations (January 1996 to December 2006) were used to fit the ARIMA models, while the subsequent 12 observations (January 2007 to December 2007) were kept for the post-sample forecast accuracy check. Before the models were fitted, the ozone concentration data were adjusted by taking simple differences of order 1 and seasonal differences of order 1. The model with the lowest value (-11.8601) of the Akaike Information Criterion (AIC), ARIMA (0, 1, 1) x (1, 1, 2)12, was selected and used to generate the forecasts (Figure 4).

Figure 3.

Annual variation of monthly ozone means

| Month | Seasonal Index |
|---|---|
| January | 107.199 |
| February | 90.8259 |
| March | 84.7179 |
| April | 80.7204 |
| May | 101.135 |
| June | 105.618 |
| July | 115.073 |
| August | 122.058 |
| September | 117.771 |
| October | 93.0941 |
| November | 80.0473 |
| December | 101.741 |

Table 1.

Seasonal Index of Ozone

This model assumes that the best forecast for future data is given by a parametric model relating the most recent data value to previous data values and previous noise. As shown in Table 2, the P-values for the MA(1), SAR(1), SMA(1) and SMA(2) terms are all less than 0.05, so these terms are significantly different from 0. The estimated standard deviation of the input white noise equals 0.00277984. Since no tests on the residuals are statistically significant at the 95% or higher confidence level, the current model is adequate to represent the data and can be used to forecast upcoming ozone concentrations. The best model for ground-level ozone in Kemaman can therefore be written as:

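$$Z(t) = a(t) + 0.53\,Z(t-12) - 0.82\,a(t-1) - 1.67\,a(t-12) + 0.73\,a(t-24) + 0.82(1.67)\,a(t-13) - 0.82(0.73)\,a(t-25) \qquad (2)$$

where a(t) is the white-noise term and Z(t) here denotes the ozone series after the regular and seasonal differencing described above, so that the seasonal autoregressive coefficient 0.53 multiplies the lagged series value Z(t-12) rather than a noise term.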
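The coefficients in the fitted model can be cross-checked by multiplying out the operator polynomials with the estimates from Table 2. The sketch below assumes the sign convention of Equation (1), in which each factor has the form (1 - coefficient x power of B), and assumes the two seasonal moving average terms enter as 1 - SMA(1) B^12 - SMA(2) B^24. The helper `poly_mul` and the variable names are hypothetical.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B, each given as {power: coefficient}."""
    out = {}
    for i, ai in a.items():
        for j, bj in b.items():
            out[i + j] = out.get(i + j, 0.0) + ai * bj
    return out

# Estimates from Table 2.
ma1, sar1, sma1, sma2 = 0.818786, 0.531745, 1.67374, -0.728689

# Moving average side: (1 - MA1*B) * (1 - SMA1*B^12 - SMA2*B^24)
ma_side = poly_mul({0: 1.0, 1: -ma1}, {0: 1.0, 12: -sma1, 24: -sma2})

# Autoregressive side: (1 - B) * (1 - B^12) * (1 - SAR1*B^12)
differencing = poly_mul({0: 1.0, 1: -1.0}, {0: 1.0, 12: -1.0})
ar_side = poly_mul(differencing, {0: 1.0, 12: -sar1})

for power in sorted(ma_side):
    print(f"noise coefficient at lag {power:2d}: {ma_side[power]: .4f}")
for power in sorted(ar_side):
    print(f"series coefficient at lag {power:2d}: {ar_side[power]: .4f}")
```

The lag-13 and lag-25 noise coefficients are the products of the regular and seasonal moving average estimates, matching the 0.82(1.67) and 0.82(0.73) terms in the model expression.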

Figure 4.

Model predicted plot of ozone concentration with actual and 95% confidence band

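| Parameter | Estimate | Std. Error | t | P-value |
|---|---|---|---|---|
| MA(1) | 0.818786 | 0.0478133 | 17.1246 | 0.000000 |
| SAR(1) | 0.531745 | 0.146213 | 3.63678 | 0.000400 |
| SMA(1) | 1.67374 | 0.092474 | 18.0996 | 0.000000 |
| SMA(2) | -0.728689 | 0.081741 | -8.91461 | 0.000000 |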

Table 2.

ARIMA (0, 1, 1) x (1, 1, 2)12 model parameter characteristics

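| Model* | RMSE | MAE | MAPE | ME | MPE | AIC |
|---|---|---|---|---|---|---|
| (A) | 0.00337 | 0.00249 | 13.526 | 0.000004 | -1.5017 | -11.2253 |
| (B) | 0.00271 | 0.00206 | 11.086 | 0.000002 | -1.8498 | -11.6431 |
| (C) | 0.00269 | 0.00201 | 10.786 | 0.000002 | -1.8157 | -11.6409 |
| (H) | 0.00267 | 0.00198 | 10.707 | 0.000003 | -1.7185 | -11.6712 |
| (I) | 0.00271 | 0.00201 | 10.870 | -0.000050 | -1.9817 | -11.6423 |
| (J) | 0.00270 | 0.00199 | 10.671 | 0.000206 | -0.5469 | -11.6286 |
| (M) | 0.00258 | 0.00206 | 11.250 | 0.000031 | -1.5638 | -11.8601 |
| (N) | 0.00257 | 0.00204 | 11.192 | -0.00009 | -2.0636 | -11.8478 |
| (O) | 0.00258 | 0.00206 | 11.298 | -0.000053 | -1.9673 | -11.8392 |
| (P) | 0.00259 | 0.00207 | 11.260 | | -1.7473 | -11.8382 |
| (Q) | 0.00259 | 0.00207 | 11.267 | 0.000030 | -1.5789 | -11.8335 |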

Table 3.

Model Comparison
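The column headings in Table 3 are not spelled out in the chapter. Assuming they carry their usual meanings (root mean squared error, mean absolute error, mean absolute percentage error, mean error and mean percentage error), the statistics can be sketched as below. The `actual` and `forecast` values are hypothetical, and a statistical package may use a slightly different divisor (for example, adjusting for the number of estimated parameters).

```python
import math

def forecast_errors(actual, forecast):
    """Summary error statistics under their usual textbook definitions."""
    errors = [a - f for a, f in zip(actual, forecast)]
    pct = [100.0 * e / a for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(p) for p in pct) / n,
        "ME": sum(errors) / n,
        "MPE": sum(pct) / n,
    }

# Hypothetical observed and forecast values (not the Kemaman data).
actual = [0.020, 0.025, 0.030, 0.025]
forecast = [0.022, 0.024, 0.027, 0.026]

stats = forecast_errors(actual, forecast)
for name, value in stats.items():
    print(f"{name}: {value:.6f}")
```

RMSE and MAE measure the size of the errors, while ME and MPE measure bias: values near zero indicate that the forecasts do not systematically over- or under-predict.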

The residual ACF (Figure 5) and PACF (Figure 6) plots show that the residuals are white noise and are not autocorrelated. Furthermore, the normal probability plot in Figure 7 shows that the residuals of the model are normally distributed.

Figure 5.

Residual autocorrelation functions (ACF) plot

Figure 6.

Residual partial autocorrelation (PACF) functions plot

Figure 7.

Residual normal probability plot

Based on the prediction for ozone concentration (Figure 4), there is a statistically significant upward trend at the Kemaman station. The detection of a steady, statistically significant upward trend in ozone concentration in Kemaman is quite alarming. It is likely due to sources of ozone precursors related to industrial activities in nearby areas and to the increase in road traffic volume.

4. Conclusion

Time series analysis is an important tool for modeling and forecasting air pollutants. Although the ARIMA (0, 1, 1) x (1, 1, 2)12 model is not suited to predicting the exact monthly ozone concentration, it provides information that can help decision makers establish strategies, set priorities and manage the use of fossil fuel resources in Kemaman. This is very important because ground-level ozone (O3) is formed from NOx and VOCs produced by human activities (largely the combustion of fossil fuels). In summary, the ozone level increased steadily in the Kemaman area and is predicted to exceed 40 ppb by 2019 if no effective countermeasures are introduced.

Acknowledgement

The researchers would like to thank DOE Malaysia for providing pollutant data for 1996-2007 and the Ministry of Higher Education (MOHE) for allocating a research grant to accomplish this study.

References

  1. Wu, H. W. Y., Chan, L. Y. 2001. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Environmental International 26, 213-222.
  2. Chan, C. Y., Chan, L. Y. 2000. The Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Geophysical Research 105, 20707-20719.
  3. Kyriakidis, P. C., Journal, A. 2001. Stochastic Modelling of Atmospheric Pollution: A Special Time Series Framework, Part II: Application to Monitoring Monthly Sulfate Deposition Over Europe. Journal of Atmospheric Environment 35, 2339-2348.
  4. Salcedo, R. L. R., Alvim, F. M., Alves, C., Martins, F. 1999. Time Series Analysis of Air Pollution data. Journal of Atmospheric Environment 33, 2361-2372.
  5. Schwartz, J., Marcus. 1990. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. American Journal of Epidemiology 131, 85194.
  6. Hies, T., Treffeisen, R., Sebald, L., Reimer. 2003. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia Time Series. Journal of Atmospheric Environment 34, 3495-3502.
  7. Kocak, K., Saylan, L., Sen. 2000. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Atmospheric Environment 34, 1267-1271.
  8. Kuang-Jung Hsu. 2003. Time Series Analysis of the Independence Among Air Pollutants. Atmospheric Environment 26B(4), 491-503, 1992.
  9. Omidravi, M., Hassanzadah, S., Hossicinibalam, F. 2008. Time Series Analysis of Ozone Data in Isfahan. Physica A: Statistical Mechanics and Its Applications 387(16-17), 4393-4403.
  10. Afroz, R., Hassan, M. N., Ibrahim, N. A. 2003. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Environmental Research 92(2), 71-77.
  11. ASMA. 2008. Alam Sekitar Malaysia Sdn Bhd. http://www.enviromalaysia.com.my 23/12/09.
  12. Hong. 1997. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. MSc thesis, North Dakota State University.
  13. Naill, P. E., Momani, M. 2009. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. American Journal of Environmental Sciences 5(5), 599-604.
  14. Angle, R. P., Sandhu, H. S. 1989. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Atmospheric Environment 23, 215-221.
  15. Colbeck, I., MacKenzie, A. R. 1994. Air Pollution by Photochemical Oxidants. Air Quality Monographs 1. Amsterdam: Elsevier, 107-171, 232-326.
  16. Lorenzini, G., Nali, C., Panicucci. 1994. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Atmospheric Environment 28, 3155-3164.
  17. Bencala, K. E., Seinfield, J. H. 1979. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Atmospheric Environment 10, 941-950.
  18. Gouveia, N., Fletcher. 2000. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Epidemiology and Community Health 54, 750-755.
  19. Hertzberg, A. M., Frew. 2003. Can Public Policy be Influenced? Environmetrics 14(1), 110.
  20. Lee, C. K. 2002. Multifractal Characteristics in Air Pollutant Concentration Time Series. Journal of Water Air Soil Pollution 135, 389-409.
  21. Liu, C. M., Hung, C. Y., Shieh, S. L., Wu, C. C. 1994. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Atmospheric Environment 28, 159-173.
  22. Roberts. 2003. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Atmospheric Environment 35, 2339-2348.
  23. Touloumi, G., Atkinson, R., Terte, A. L. 2004. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Environmetrics 15, 101-117.
  24. Yee, E., Chen. 1997. Time Series Analysis of Surface Ozone Monitoring Records in Kemaman, Malaysia. Journal of Atmospheric Environment 31, 991-1002.
