The Role of Statistical Methods and Tools for Weather Forecasting and Modeling

Emmanuel P. Agbo

doi:10.5772/intechopen.96854

Abstract

The need to understand the role of statistical methods for the forecasting of climatological parameters cannot be trivialized. This study gives an in depth review on the different variations of the Mann-Kendall (M-K) trend test and how they can be applied, regression techniques (Simple and Multiple), the Angstrom-Prescott model for solar radiation, etc. The study then goes ahead to apply some of them with data obtained from the Nigerian Meteorological Agency (NiMet), and applying tools like the python programming language and Wolfram Mathematica. Results show that the maximum ambient temperature for Calabar is increasing (Z = 2.52) significantly after the calculated p-value <0.05 (significant level). The seasonal M-K test was also applied for the dry and wet seasons and both were found to be increasing (Z = 3.23 and Z = 4.04 respectively) after their calculated p-values <0.05. The relationship between refractivity and other meteorological parameters relating to it was discerned using partial differential equations giving the gradient of each with refractivity; this was compared with results from the correlation matrix to show that the water vapor contents of the atmosphere contributes significantly to the variation of refractivity. Multiple linear regression has also been adopted to give an accurate model for the prediction of refractivity in the region after the residual error between the calculated refractivity and predicted refractivity was minimal.

Keywords

meteorology
forecasting
python programming
climate
Mann-Kendall
multiple linear regression

Author Information

Show +

Emmanuel P. Agbo*
- Cross River University of Technology, Nigeria
- Lafarge Africa PLC, Mfamosing Plant, Nigeria
- Osun State University, Nigeria

*Address all correspondence to: emmanuelpaulagbo@gmail.com

1. Introduction

The importance of statistical modeling and forecasting of time series data, etc., cannot be overemphasized. The benefits ranges from easy interpretability arising from visualization of results to the removal of the mysticism factor for the layman. The word ‘forecasting’ has to do with predicting the future based on data from the past and present. This is regularly done by the analysis of trends.

A routine example might be the estimation of temperature trends for some specified future date. Compared to forecasting, prediction can be seen as a term which is more general.

Forecasting methods have been applied in different areas ranging from climatology, finance, foreign exchange, etc. This has been applied in different regions of the world for better prediction and simulation. The key distinction in Information and Communication Technology (ICT) is the fact that with this technology, we can make predictions and simulations from previously obtained data. This is true and can be applied for every area while paying attention to the rules that govern them.

In this study we will be applying some statistical methods which can be adopted for the forecasting of climatic (weather) parameters in different regions of the world.

It is important to note that the predictability of the atmosphere is not perfect, this brings into context the fact that although statistical methods are necessary, results obtained are not totally accurate which is why room for errors (uncertainties) are given, albeit, a trend can be observed [1]. Statistical methods have been applied in the study of different regions for example, Daniel S. Wilks in [1] buttressed on the use of these methods on the analyses of different regions that do not necessarily have the same climatic condition. This brings into context the fact that laws are true irrespective of the region, i.e. neglecting all other factors that have little contribution to weather, the same methods can be applied in different regions to yield accurate results.

Analysis of trends can be useful in depicting and predicting the changing patterns and erraticism of some climatic parameters. This analysis gives a proper knowledge about the changing conditions of the climate and its effects, by the evaluation of meteorological parameters.

A data scientist using any tool or software for modeling and forecasting is particularly interested in the progression of these parameters (meteorological) as a function of time(t) ft. The designers of navigation or monitoring systems cannot trivialize the importance of forecasting as this is a very important part of their system. The spatial and temporal changes of atmospheric parameters calls for the adoption of this analysis to discern the effects of some meteorological parameters on some variables; for example, see [2].

A very popular software for any data scientist that is willing to understand the nitty-gritty of weather forecasting is Python Programming. This paper will explain in detail the setup processes for this to help the layman get started. A dataset of temperature trend in Calabar, Nigeria will be used at the end of this chapter to test the processes explained for better visualization.

The applicability of results from forecasting cannot be underestimated because this is great information for people that depend on weather conditions like farmers, surfers, and event planners, etc. The accurate prediction of atmospheric parameters can go a long way in positively affecting the financials of the informed, as money can be saved by avoiding unnecessary cost during trying times [3]. Natural disasters like Tsunami can be predicted with the correlation of meteorological parameters, harnessing information as explained previously and then incorporating this information through machine learning into the design of forecasting systems.

We delve deeper into a review of statistical methods like the M-K test and its different variations, the Angstrom-Prescott model for the estimation of solar radiation, linear regression techniques, with a deep look into multiple linear regression which will be applied in predicting refractivity after obtaining the coefficients of the variables. Results will be obtained and explained.

2. Review of statistical tests/methodology

With the shift going on in the world of technology, the implementation of some time series forecasting methods will be explained as well as their python implementation techniques. We often use forecasting models on time series data for the estimation of future trends of meteorological parameters.

2.1 Statistical test for trend (Mann-Kendall trend test)

One of the most important and widely applied test for trends involving time series is the Mann-Kendall trend test. It is mostly used for environmental and hydrological data. The test is non parametric and does not necessitate the data conforming to a particular distribution, similarly, the sensitivity of the test due to an inhomogeneous series resulting to abrupt breaks is very low [4]. The null hypothesis Ho which says that there is no monotonic trend in the series, is tested against the alternative hypothesis H1 which says that there is a trend in the series. The test is applied to cases where a range of data xiis in agreement with the equation below;

xi=fti+εiE1

ftiis a function of time and εi are the range residuals with zero mean.

The Mann-Kendall test statistic S is calculated using the formula

S=∑k=1n−1∑j=k=1nsgnxj−xkE2

where;

sgnxj−xk=+1;ifxj−xk>00;ifxj−xk=0−1;ifxj−xk<0E3

n in Eq. (2) is the number of data values in the studied series. The advantage of this test is that it can handle the situation where data values are incomplete with respect to the number of years or months, etc. [4]

In the case where n is greater than or equal to 10 (10 and above), we adopt the normal approximation (Z).

To find the variance of S, ‘VAR(S)’, we compute Eq. (4) below.

VARS=118nn−12n+5−∑p=1gtptp−12tp+5E4

From the equation, the number of data values is represented by n, the number of equal of tied groups is represented by g, and the number of data values in the pth group is represented by tp.

We now use the results from VAR(S) to find the test statistic Z

Z=S−1VARS;S>00;S=0S+1VARS;S<0E5

A decreasing trend can be discerned from results of Eq. (5) when the value of Z is negative and an increasing trend when Z is positive (Table 1).

Significance level (∝)	Required n
0.1 (10%)	≥ 4
0.05 (5%)	≥ 5
0.01 (1%)	≥ 6
0.001 (10%)	≥ 7

Table 1.

Significance level (∝) required for given numbers of data.

The significance of an increasing or decreasing trend is observed when the p-value of the series is lower than the significance level (∝), in this case, we can say there is a trend observed trend in the series [5]. The adoption of different significant levels with respect to the number of given data values n is given in Table 1.

The classification of this probability/significance level is important because results can be confused to be entirely true. We need to understand that the significance level of say 0.05, means that there is a 5% probability that a mistake will be made while rejecting the null hypothesis Ho. Similarly, a significance level of 0.01 means that there is a 1% probability that a mistake will be made while rejecting Ho.

2.2 Regression analysis

The two easiest ways to forecast time series data by observation are the simple regression and the moving average, they both depend on historical data. The former demands mere observation of the previous trend and drawing up an extrapolation from there; this can be somewhat less accurate. The moving average has been used for forecasting meteorological data like rainfall (See reference [6]). Analyzing with regression has to do with the relationship one variable which is dependent has with one or more independent variables. We use them to check for models showing the strength of relationship between the variables and any possible future relationships [1].

2.2.1 Simple linear regression

This regression variation is based on the assumption that the two variables (dependent and independent variable) show a linear relationship between the intercept and the slope, similarly, there is no residual error in this regression and the value is constant across all observations.

Y=±mX±c+eE6

Y is the dependent variable.

X is the independent variable.

m is the value of the slope.

c is the intercept.

e is the residual error.

The regression is depicted by a straight line describing the Eq. (6) above (Figure 1).

Figure 1.
Schematic illustration of simple linear regression. The regression line, Y=−mX+c+e, is chosen as the one minimizing some measure of the vertical differences (the residuals) between the points and the line. The residual e is the difference between the data point and the regression line.

2.2.2 Multiple linear regression

This model is similar to that of simple linear regression, but the only exception is that it has multiple independent variables, unlike that of simple linear regression which has just the one. This can be represented by Eq. (7);

Y=±m1X1±m2X2±m3X3±c+eE7

Y is the dependent variable.

X1,X2,X3 are the independent variables.

m1,m2,m3 are the values of the slopes.

c is the intercept.

e is the residual error.

One thing to note about multiple linear regression is that the independent variables must not be collinear, i.e., they do not have to have a high correlation coefficient between each other, else there will be difficulty in assessing the relationship between the dependent and independent variables.

We also need to take note that before multiple linear regression is performed on range of data values, a linear relationship must exist between each independent variable and the dependent variable. The amount of residual error must be almost constant at each point in the model. The multiple linear regression will be applied to study and predict refractivity trend in Calabar, Nigeria. This was done with the ‘statsmodel’ package in python programming and results have been displayed in section 2.5.

A perfect meteorological equation that this regression technique can be applied to is the refractivity equation recommended by the International Telecommunication Union (ITU) shown in Eq. (8);

N=77.6PT+3.73×105eT2N−unitsE8

P is the Atmospheric Pressure (hPa).

e is the Atmospheric Vapor Pressure (hPa).

T is the Absolute Temperature (K).

Eq. (8) shows the relationship between refractivity (dependent variable) and meteorological parameters (ambient temperature, atmospheric pressure, and vapor pressure) which are all independent variables.

This has been applied in [7] modeling the meteorological parameters for the accurate determination of refractivity. These meteorological parameters (Ambient Temperature, Atmospheric Pressure and Relative Humidity) have been obtained from the Nigeria meteorological Agency (NiMet), Calabar.

Results have been presented in section 2.5. From Eq. (8), we obtain the atmospheric vapor pressure e from the relation;

e=esH100hPaE9

es is the saturated vapor pressure (hPa) calculated from;

es=6.11exp17.26T−273.16T−35.87hPaE10

2.3 Review of the application of simple linear regression analysis in climatology (the Angstrom-Prescott model)

The linear regression technique can be applied to find the relationships between an independent variable and the dependent variable. We can see the explanation of this from Eq. (6).

One major example of the benefits of linear regression is the estimation of the Angstrom-Prescott coefficients of the Angstrom-Prescott model for a particular region as this relates to solar radiation. The Angstrom-Prescott model is given by [8];

HH0=a+bnNE11

where the monthly average daily extraterrestrial radiation is given by H₀, H is the monthly average daily global radiation in Wh/m²/day. n is the actual sunshine duration in a day for a particular region (hours), N is the monthly mean length of the day in hours. The Angstrom-Prescott empirical coefficients are given by a and b. The linear regression technique has been adopted by Srivastava and Pandey [8] to find by a and b. Comparing Eq. (6) to Eq. (9) we have that;

HH0=YvariablenN=Xvariableb=m=slopea=c=YinterceptE12

This shows that if we have the variables ‘HH0 and nN’, we can get the values of a and b, from our Y intercept and slope respectively. Getting these constant values for specific regions will help us forecast future trends.

For better understanding, the extraterrestrial radiation H₀ is given by the equation [9];

H0=24×3600×ISCπ×1+0.33cos360×d365×cosϕcosδsinω+πω180sinϕsinδE13

Here, ISC is the solar constant with a value of 1367 W/m², d represents the day of the year (from January 1st to December 31st); taking January 1st as 1 and December 31st as 365 or 366 (in the case of a leap year). The latitude of the study location, the declination angle and the sunset hour angle are represented by ϕ,δ, and ω respectively. ω=cos−1−tanϕtanδ. The declination angle can be obtained from [9].

δ=23.45sin360284+d365E14

The monthly mean length of the day (in hours) can be obtained from [9].

N=2ω15E15

The above equations can be applied to estimate the coefficients using linear regression. By this we can use these coefficients to predict solar radiation for a given region.

We know that the declination angle ranges from −23.5≤δ≤+23.5. From Figure 2, we can see that the declination angle is 0 ° C at the Verbal and Autumnal Equinox, while the angles are −23.5 and + 23.5 at the summer and winter solstice respectively. It is easy to see why this has a huge effect on the variation of Global solar radiation.

Figure 2.
Yearly variation of declination angle δ with respect to the days of the year.

Klein in 1977 [10] recommended average days of the various months and corresponding angle of declination as in Table 2.

Month	Date	Day of the year (d)	declination angle (δ)
January	17	17	−20.9
February	16	47	−13
March	16	75	−2.4
April	15	105	9.4
May	15	135	18.8
June	11	162	23.1
July	17	198	21.2
August	16	228	13.5
September	15	258	2.2
October	15	288	9.6
November	14	318	−18.9
December	10	344	−23

Table 2.

Recommended average days for various months and their corresponding declination angles [10].

2.4 Calculus in climatology

Applying calculus in environmental science is important in predicting a lot of things. It can be applied to understand the impacts of parameters on the variations of other parameters that they relate to. It is important to know that calculus is the ‘mathematical study continuous change’ so this can be applied in climatology to discern the impacts of some parameters on the “continuous change” of others [11, 12, 13].

Writing the refractivity equation in terms of relative humidity H, by substituting (10) into (9), and the into (8), we have;

N=77.6PT+3.73×1056.11exp17.26T−273.16T−35.87×0.01HT2N−unitsE16

Similarly, obtaining refractivity in terms of the saturated vapor pressure es using Eq. (8) and (9) gives;

N=77.6PT+3.73×105esH100T2N−unitsE17

Now applying partial differentials to the equations for refractivity; Eqs. (8), (16), and (17), we obtain partial differentials relating each parameter to refractivity;

∂N∂P=77.6T∂N∂T=−77.6PT2+7.46×105eT3∂N∂H=22790.3exp17.26−273.16+T−35.87+TT2∂N∂e=3.73×105T2∂N∂es=3.73×103×HT2E18

From monthly Temperature, Humidity and Atmospheric pressure data obtained for 2005–2018 from the archives of the Nigerian meteorological agency (NiMet) Calabar, the atmospheric vapor pressure and the saturated vapor pressure can be obtained by applying these parameters in Eqs. (9) and (10) (Figure 3).

Figure 3.
Map of study area showing Calabar as a coastal area (left) and the exact location of the Nigerian meteorological agency (NiMet) where the data was obtained (right).

2.5 Python implementation for Mann-Kendall trend test

With the python software installed, the next step will be installing an IDE (integrated development environment). The easiest IDE to use is the Jupyter Notebook. This IDE displays results as you code.

We will walk you through the processes for analyzing data by using the data for Calabar in the south of Nigeria, collected from the archives of the Nigeria meteorological agency (NiMet). Research has been done in this area in climatology [14, 15, 16, 17, 18], but with the application of python and the Mann-Kendall test can give more meaning to time series data.

We need to install the python package for the Mann-Kendall test called ‘pymannkendall’. To install this package, the following python packages are required;

Numpy
Scipy

For handling and cleaning data we need the ‘pandas’ package, and for data visualization we need the ‘matplotlib’ package.

We want to analyze maximum ambient temperature data for 20 years in Calabar.

In the Jupyter notebook, the first step will be to import the respective packages. We must also note that for our examples in the Appendices, we stored the excel file containing the data used for the analysis in the same folder as the python file for easy reference.

Appendix A shows the process of importing the installed packages required for the analysis into the workspace.

Before we perform the Mann-Kendall test, we need to import the excel file titled ‘Temperature’ in which the table is stored, in a sheet name called ‘MAX’. See Appendix B.

Appendix C shows how the Mann-Kendall original test is performed after importing the packages and data. We assigned the name of the imported data file as ‘Max’ and set the significance level (∝) to the default 5% (0.05); this can be adjusted by the user to his/her preference. Results were obtained and displayed in Appendix C.

We now perform the seasonal M-K test for the dry season variation, we import the excel file titled ‘Temperature’, the date column will be an index column. The sheet name of the excel file in which the data is stored is called ‘dry’. This implementation can be seen from Appendix D.

Appendix E shows the seasonal M-K test python implementation for the dry season variation. By setting the significance level (∝) to the default 5% (0.05), and the period to 4, which stands for the 4 months of the dry season in the study area (November to February), we have satisfied the criteria for the seasonal M-K test.

For the wet season variation, the excel file titled ‘Temperature’ will be imported and the date will be an index column. The sheet name is called ‘wet’. Appendix F shows the implementation code for this importation.

We can now perform the seasonal Mann-Kendall test on the wet season data. Appendix G shows this. The Seasonal Mann-Kendall test of the imported file we assigned the name ‘wet’ has been achieved by setting the significance level (∝) to the default 5% (0.05); this can be adjusted by the user to his preference. We also set the period to 8, which stands for the 8 months of the wet season in the study area (March to October).

There are other variations of the Mann-Kendall test along with their python implementation [19]. These can be used depending on the data obtained and the aim of the test.

Hamed and Rao Modified MK Test (hamed_rao_modification_test): This test addresses serial correlation issues
Yue and Wang Modified MK Test (yue_wang_modification_test): This is also a variance correction method for considered serial autocorrelation proposed by Yue, S., & Wang, C. Y. (2004). User can also set their desired significant n lags for the calculation.
Modified MK test using Pre-Whitening method (pre_whitening_modification_test): This test pre-whitens the time series before applying the trend test
Modified MK test using Trend Free Pre-Whitening method (trend_free_pre_whitening_modification_test): This test removes the trend component from the series before pre-whitening and the applying the trend test
Multivariate MK Test (multivariate_test): As the name implies, this test is for multivariate (multiple) parameters. This can be used for monthly data, where each month can be considered as a parameter.
Regional MK Test (regional_test): As the name implies, this calculates the trend at a regional scale
Correlated Multivariate MK Test (correlated_multivariate_test): Unlike the Multivariate MK test, this test is also a multivariate mk test, but the parameters are correlated.
Correlated Seasonal MK Test (correlated_seasonal_test): This test is similar to the seasonal MK test, but in this is used when the time series is significantly correlated with previous seasons/months
Partial MK Test (partial_test): Due to the fact that in some studies, many factors can affect the dependent parameters, so we overcome this by inputting one dependent parameter and an independent parameter.
Theil-Sen’s Slope Estimator (sens_slope): This test method proposed by Theil (1950) and Sen (1968) [20] is applied to estimate the magnitude of the monotonic trend.
Seasonal Theil-Sen’s Slope Estimator (seasonal_sens_slope): This test method considers the seasonal effect of the Theil-Sen’s Slope Estimator.

3. Results and discussion

3.1 Results

3.2 Discussion

For the annual variation in Figure 4, results show that there is a trend in the series as the p-value is less than the significance level (0.05). The positive Z value (observed from Appendix C) shows that the series is increasing. We can conclude that the maximum ambient temperature variation is increasing, and it is doing so with significance, the slope of the trend can be observed from the results in Appendix C.

Figure 4.
Mann-Kendall trend of maximum ambient temperature.

For the dry season variation observed in Figure 5, results show that there is a trend in the series. The positive Z value of the dry season trend observed from Appendix E shows that the series is increasing. We can conclude that the maximum temperature variation in the dry season is increasing significantly as the calculated p-value is less than the significance level (0.05), the slope of the trend can be observed from results in Appendix E.

Figure 5.
Seasonal trend of maximum ambient temperature for dry and wet season.

For the wet season variation observed also in Figure 5, results show that there is a trend in the series. The positive Z value from Appendix G shows that the series is increasing. We can conclude that the maximum temperature variation in the dry season is increasing significantly as the calculated p-value is less than the significance level (0.05), the slope of the trend can be observed from the results in Appendix G.

These results are in agreement with Agbo et al. [2] for the same region.

3.2.1 Relationship between refractivity and meteorological parameters

To understand the relationship between refractivity and all parameters relating to it, we adopt Eq. (18) by substituting obtained and calculated data.

From the data obtained at the Nigerian Meteorological Agency (NiMet) Calabar, and adopting Eq. (9) and (10) we obtain the total annual values for the meteorological parameters as;

P = 1005.97 hPa; H = 85.71%; T = 300.28 K; e = 30.71 hPa; e_s = 35.94 hPa. Substituting these values into the equations in Eq. (18), we obtain;

∂N∂P=0.258425∂N∂T=−0.0196183∂N∂H=1.48436∂N∂e=4.13672∂N∂es=3.62832E19

Results from the gradients of the differential equations in Eq. (19) show that the vapor pressure and saturated vapor pressure contributes more to the variation of refractivity. The relative humidity similarly has a high gradient; this can be physically explained by relating the water vapor content of the atmosphere to the variation of refractivity.

Figure 6.
Correlation matrix of atmospheric parameters and refractivity.

The correlation plot of refractivity and all other meteorological parameters is shown in Figure 6. Results agree with that of the differential equations in Eq. (19). As seen in Eq. (19), the correlation plot showed that the atmospheric vapor pressure and relative humidity had high positive relationships with refractivity. The saturated vapor pressure however has a low correlation coefficient compared to the high gradient in Eq. (19); this can be interpreted thus; that the variation of the saturated vapor pressure has a relatively high contribution to the variation of refractivity, but the saturated vapor pressure does not have a similar trend to that of refractivity.

3.2.2 Application of multiple linear regression in climatology

Multiple linear regression has been applied to relate refractivity with obtained meteorological parameters. The goal is to obtain an equation that relates refractivity to meteorological parameters through Multiple Linear Regression (MLG). Using Eq. (8) to calculate refractivity, we show results in Table 3. As part of the conditions for carrying out multiple linear regression, we have to test for collinearity between the independent variables. We see from the correlation matrix in Figure 6 that the independent variables are not collinear, hence this satisfies the criteria for carrying out MLG.

Year	Pressure	Temperature	Humidity	Refractivity (N)
2005	1005.15	300.23	87.15	388.88
2006	1005.38	300.17	85.38	385.73
2007	1005.50	300.10	84.84	384.74
2008	1005.44	300.21	86.00	387.11
2009	1005.83	300.29	83.36	383.75
2010	1005.46	300.71	83.26	385.62
2011	1005.80	300.12	87.49	388.83
2012	1005.75	300.44	87.97	391.57
2013	1005.74	300.03	87.07	387.69
2014	1005.92	299.79	85.15	383.38
2015	1006.55	300.03	85.68	385.83
2016	1006.97	300.70	85.69	389.73
2017	1007.09	300.58	86.33	389.90
2018	1007.02	300.52	84.58	386.84

Table 3.

Data of obtained meteorological parameters and refractivity.

From our analysis we obtain the coefficients (slopes) of the variables (meteorological parameters) and the intercept from Table 4 to form the equation below;

	C	S_e	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1617.97	51.05	−31.69	2.30 × 10⁻¹¹	−1731.72	−1504.23
Pressure	0.17	0.05	3.25	8.75 × 10⁻⁰³	0.05	0.28
Temperature	5.68	0.13	44.90	7.22 × 10⁻¹³	5.39	5.96
Humidity	1.53	0.02	68.62	1.05 × 10⁻¹⁴	1.48	1.58

Table 4.

Output of the multiple linear regression showing the coefficients (C) of each parameter and their standard error (S_e).

RefractivityN=1.53Humidity+0.17Pressure+5.68Temperature−1617.97E20

The above equation can be used to accurately predict the variation of refractivity, given the values of the meteorological parameters. Table 4 shows these results obtained from the multiple linear regression. The values for the predicted refractivity (Predicted N) was gotten from Eq. (20) by substituting the values of the meteorological parameters. This equation is more straight forward that the equation recommended by ITU as all the variables and coefficients are all linear with respect to refractivity.

Figure 7 shows the trend of refractivity calculated from Eq. (8) with that of predicted refractivity, calculated from Eq. (20). The residual error seen from Table 5 shows relatively constant values (in agreement with our MLG conditions), and a small deviation from the original values of refractivity.

Figure 7.
Comparison plot of annual refractivity and predicted refractivity.

Year	N	Predicted N	Residuals
2005	388.88	388.87	0.005
2006	385.73	385.85	−0.121
2007	384.74	384.69	0.056
2008	387.11	387.09	0.025
2009	383.75	383.54	0.204
2010	385.62	385.72	−0.109
2011	388.83	388.89	−0.060
2012	391.57	391.45	0.124
2013	387.69	387.74	−0.048
2014	383.38	383.45	−0.074
2015	385.83	385.74	0.091
2016	389.73	389.65	0.076
2017	389.90	389.97	−0.072
2018	386.84	386.93	−0.095

Table 5.

Residual output derived from the results of the coefficients, showing the predicted refractivity values compared to the refractivity values to give the residuals.

From Table 4 probability values (p-values) of the parameters are all less than the significance level (5% = 0.05; 95% confidence level), this shows that the variation agrees with the alternative hypothesis and shows a trend relating the independent variables to the dependent variables.

Results from Figure 7 show the minimal error between the predicted refractivity and the calculated refractivity. Table 5 shows the values for both as well as the residual error between them. This shows that the error is small and thus, Eq. (20) can be adopted for the prediction of refractivity for the study area. This equation can be modified and refractivity N can be gotten in terms of other parameters like the saturated vapor pressure and the atmospheric vapor pressure.

4. Conclusion

There are myriads of ways in which weather can be forecasted and this arises from the understanding of basic meteorological parameters and how they behave in the atmosphere; and also from the understanding of the role of statistics in climate research [21]. Research in this area has been reviewed to give a better understanding of the different techniques for analyzing trends; which include, Linear Regression (Multiple and Simple), the Mann-Kendall trend test [22, 23] (to test for trends in a time series variation), the Angstrom-Prescott model for estimating solar radiation as well as the python implementation of some various techniques.

The multiple linear regression technique was applied to model an equation to accurately predict the trend for refractivity in the study location, the simple linear regression technique has been explained as well as accurate methods for its application in the predicting/estimation of the Angstrom-Prescott coefficients. These coefficients can be gotten for specific regions and can be accurately applied to predict solar radiation in that region.

Results from the multiple linear regression gave an accurate model for the prediction of refractivity in the region after the residual error between the calculated refractivity and predicted refractivity was minimal.

The Mann-Kendall original and seasonal test has been applied to analyze the maximum temperature in Calabar, Nigeria for the annual and seasonal (dry and wet season) variation respectively, and results show that the annual, dry season and wet season had increasing variations (after having positive Kendall Z-values of 2.52, 3.23, 4.04 respectively) and they were all increasing significantly at 5% (0.05) level of significance after their p-values were all less than 0.05 agreeing with Agbo and Ekpo [23].

The relationship between refractivity and other meteorological parameters relating to it was discerned using partial differential equations giving the gradient of each with refractivity; this was compared with results from the correlation matrix to show that the water vapor contents of the atmosphere contributes significantly to the variation of refractivity.

Acknowledgments

The author will like to acknowledge the Nigerian Meteorological Agency (NiMet) Calabar for providing the necessary data for applying in this study.

The author will also like to express his thanks and appreciation to the editor, whose comments greatly improved the chapter.

Conflict of interest

The author declares no conflict of interest.

A.

A.1 Appendix A

Input:

import numpy as np

import pandas as pd

import pymannkendall as mk

import matplotlib.pyplot as plt

%matplotlib inline

from pandas import ExcelWriter

from pandas import ExcelFile

from matplotlib.figure import Figure

This will import the above installed packages into the workspace.

A.2 Appendix B

Max = pd.read_excel(“Temperature.xlsx”, ‘MAX’ index_col= ‘YEAR’)

The excel file titled ‘Temperature’ will be imported and the data will be an index column. The sheet name is called ‘MAX’.

We can now perform the Mann-Kendall test

A.3 Appendix C

Input

mk.original_test(Max, alpha=0.05)

Output

Mann_Kendall_Test(trend=‘increasing’, h=True, p=0.011793457077065028, z=2.518264946676251, Tau=0.5164835164835165, s=47.0, var_s=333.6666666666667, slope=0.06763844012453053, intercept=303.5218288324476)

A.4 Appendix D

dry=pd.read_excel(“Temperature data.xlsx”, ‘Sheet2’, index_col= ‘YEAR’)

The excel file titled ‘Temperature’ will be imported and the data will be an index column. The sheet name is called ‘dry’.

We can now perform the Mann-Kendall test

A.5 Appendix E

Input

mk.seasonal_test(dry, alpha=0.05, period=4)

Output

Seasonal_Mann_Kendall_Test(trend=‘increasing’, h=True, p=0.001232892414896325, z=3.231159219618304, Tau=0.3269230769230769, s=119.0, var_s=1333.6666666666667, slope=0.08467049808428379, intercept=305.1036046113848)

A.6 Appendix F

wet=pd.read_excel(“Temperature data.xlsx”, ‘Sheet3’, index_col= ‘YEAR’)

The excel file titled ‘Temperature’ will be imported and the data will be an index column. The sheet name is called ‘wet’.

We can now perform the Mann-Kendall test

A.7 Appendix G

Input

mk.seasonal_test(wet, alpha=0.05, period=8).

Output

Seasonal_Mann_Kendall_Test(trend=‘increasing’, h=True, p=5.126153098378161e-05, z=4.049799512953561, Tau=0.28846153846153844, s=210.0, var_s=2663.3333333333335, slope=0.05741935483871145, intercept=302.85004032258064)

References

1. Statistical weather forecasting. In. Daniel S. Wilks S, editor. Elsevier: International Geophysics; 1995. pp. 159-232. DOI: 10.1016/S0074-6142(06)80042-2
2. Agbo EP, Ettah EB, Eno EE. The impacts of meteorological parameters on the seasonal, monthly, and annual variation of radio refractivity. Indian Journal of Physics. 2021; 95: 195-207. DOI: 10.1007/s12648-020-01711-9
3. Datta A, Si S, Biswas S. Complete Statistical Analysis to Weather Forecasting. In: Das A., Nayak J., Naik B., Pati S., Pelusi D. (eds) Computational Intelligence in Pattern Recognition. Advances in Intelligent Systems and Computing, 2020. vol. 999. Springer. Singapore. DOI: 10.1007/978-981-13-9042-5_65
4. Alhaji UU, Yusuf AS, Edet CO, Oche CO, Agbo EP. Trend analysis of temperature in Gombe state using Mann Kendall Trend test. Journal of Scientific Research and Reports. 2018;20(1):1-9. DOI: 10.9734/JSRR/2018/42029
5. Agbo EP, Ekpo CM, Edet CO. Analysis of the effects of meteorological parameters on radio refractivity, equivalent potential temperature and field strength via Mann-Kendall test. Theoretical and Applied Climatology. 2021 Feb; 143 (3): 1437-1456. DOI: 10.1007/s00704-020-03464-1
6. Akrami SA, El-Shafie A, Naseri M, Santo CAG. Rainfall data analyzing using moving average (MA) model and wavelet multi-resolution intelligent model for noise evaluation to improve the forecasting accuracy. Neural Comput & Applic. 2014;25:1853-1861. DOI: 10.1007/s00521-014-1675-0
7. Ajileye OO, Popoola OS, Kayode FF, Rabiu AB. Meteorological models for determination of surface radio refractivity over Nigeria. Current Journal of Applied Science and Technology. 2019;36(3):1-15. DOI: 10.9734/CJAST/2019/v36i330235
8. Srivastava RC, Pandey H. Estimating Angstrom-Prescott coefficients for India and developing a correlation between sunshine hours and global solar radiation for India. International Scholarly Research Notices. 2013;2013
9. Razmjoo A, Heibati SM, Ghadimi M, Qolipour M, Nasab JR. Using Angstrom-Prescott (A-P) Method for Estimating Monthly Global Solar Radiation in Kashan. J Fundam Renewable Energy Appl. 2016: 6(5). DOI: 10.4172/2090-4541.1000214
10. Klein SA. Calculation of monthly average insolation on tilted surfaces. Solar Energy. 1977;19(4):325-329
11. Ustrnul Z, Czekierda D. Application of GIS for the development of climatological air temperature maps: an example from Poland. Meteorological Applications: A journal of forecasting, practical applications, training techniques and modelling. 2005 Mar;12(1):43-50
12. Testud J, Chong M. Three-dimensional wind field analysis from dual-Doppler radar data. Part I: Filtering, interpolating and differentiating the raw data. Journal of Applied Meteorology and Climatology. 1983 Jul 1;22(7):1204-1215
13. Błażejczyk KR, Matzarakis AN. Assessment of bioclimatic differentiation of Poland based on the human heat balance. Geographia Polonica. 2007;80(1):63-82
14. Agbo EP, Ekpo CM. Trend Analysis of the Variations of Ambient Temperature Using Mann-Kendall Test and Sen’s Estimate in Calabar. Southern Nigeria
15. Edet CO, Eno EE, Ettah EB, Kamgba FA. Seasonal Variation of Radio Refractivity in Calabar, Nigeria. measurements. 2017 Jun;6(6)
16. Edet CO, Eno EE, Ettah EB. Monthly Variation of Radio Refractivity in Calabar, Nigeria. International Journal of Scientific Research Engineering & Technology. IJSRET: ISSN; 2017. pp. 2278-0882
17. Edet CO, Eno EE, Ettah EB. Effects of Variations in Meteorological Parameters of Atmospheric Pressure. Relative Humidity and Temperature on Radio Refractivity in Calabar
18. Kamgba FA, Edet CO, Njok AO. Effects of some meteorological parameters on wind energy potential in Calabar, Nigeria. Asian journal of physical and chemical. sciences. 2017 Nov 16:1-7
19. Hussain M. Mahmud I. pyMannKendall: a python package for non-parametric Mann Kendall family of trend tests. Journal of Open Source Software. 2019;4(39):1556. DOI: 10.21105/joss.01556
20. Sen PK. Estimates of the regression coefficient based on Kendall's tau. Journal of the American statistical association. 1968 Dec 1;63(324):1379-1389
21. Zwiers FW, Storch HV. On the role of statistics in climate research. International Journal of Climatology. 2004;24:665-680. DOI: 10.1002/joc.1027
22. Agbo EP, Ekpo CM, Edet CO. Trend Analysis of Meteorological Parameters, Tropospheric Refractivity, Equivalent Potential Temperature for a Pseudoadiabatic Process and Field Strength Variability. Using Mann Kendall Trend Test and Sens Estimate. arXiv preprint arXiv. 2020 Oct 9;2010:04575
23. Agbo EP, Ekpo CM. Trend Analysis of the Variations of Ambient Temperature Using Mann-Kendall Test and Sen’s Estimate in Calabar, Southern Nigeria. InJournal of Physics: Conference Series 2021 Jan (Vol. 1734, No. 1, p. 012016). IOP Publishing. . DOI: 10.1088/1742-6596/1734/1/012016

[1] 1. Statistical weather forecasting. In. Daniel S. Wilks S, editor. Elsevier: International Geophysics; 1995. pp. 159-232. DOI: 10.1016/S0074-6142(06)80042-2

[2] 2. Agbo EP, Ettah EB, Eno EE. The impacts of meteorological parameters on the seasonal, monthly, and annual variation of radio refractivity. Indian Journal of Physics. 2021; 95: 195-207. DOI: 10.1007/s12648-020-01711-9

[3] 3. Datta A, Si S, Biswas S. Complete Statistical Analysis to Weather Forecasting. In: Das A., Nayak J., Naik B., Pati S., Pelusi D. (eds) Computational Intelligence in Pattern Recognition. Advances in Intelligent Systems and Computing, 2020. vol. 999. Springer. Singapore. DOI: 10.1007/978-981-13-9042-5_65

[4] 4. Alhaji UU, Yusuf AS, Edet CO, Oche CO, Agbo EP. Trend analysis of temperature in Gombe state using Mann Kendall Trend test. Journal of Scientific Research and Reports. 2018;20(1):1-9. DOI: 10.9734/JSRR/2018/42029

[5] 5. Agbo EP, Ekpo CM, Edet CO. Analysis of the effects of meteorological parameters on radio refractivity, equivalent potential temperature and field strength via Mann-Kendall test. Theoretical and Applied Climatology. 2021 Feb; 143 (3): 1437-1456. DOI: 10.1007/s00704-020-03464-1

[6] 6. Akrami SA, El-Shafie A, Naseri M, Santo CAG. Rainfall data analyzing using moving average (MA) model and wavelet multi-resolution intelligent model for noise evaluation to improve the forecasting accuracy. Neural Comput & Applic. 2014;25:1853-1861. DOI: 10.1007/s00521-014-1675-0

[7] 7. Ajileye OO, Popoola OS, Kayode FF, Rabiu AB. Meteorological models for determination of surface radio refractivity over Nigeria. Current Journal of Applied Science and Technology. 2019;36(3):1-15. DOI: 10.9734/CJAST/2019/v36i330235

[8] 8. Srivastava RC, Pandey H. Estimating Angstrom-Prescott coefficients for India and developing a correlation between sunshine hours and global solar radiation for India. International Scholarly Research Notices. 2013;2013

[9] 9. Razmjoo A, Heibati SM, Ghadimi M, Qolipour M, Nasab JR. Using Angstrom-Prescott (A-P) Method for Estimating Monthly Global Solar Radiation in Kashan. J Fundam Renewable Energy Appl. 2016: 6(5). DOI: 10.4172/2090-4541.1000214

[10] 10. Klein SA. Calculation of monthly average insolation on tilted surfaces. Solar Energy. 1977;19(4):325-329

[11] 11. Ustrnul Z, Czekierda D. Application of GIS for the development of climatological air temperature maps: an example from Poland. Meteorological Applications: A journal of forecasting, practical applications, training techniques and modelling. 2005 Mar;12(1):43-50

[12] 12. Testud J, Chong M. Three-dimensional wind field analysis from dual-Doppler radar data. Part I: Filtering, interpolating and differentiating the raw data. Journal of Applied Meteorology and Climatology. 1983 Jul 1;22(7):1204-1215

[13] 13. Błażejczyk KR, Matzarakis AN. Assessment of bioclimatic differentiation of Poland based on the human heat balance. Geographia Polonica. 2007;80(1):63-82

[14] 14. Agbo EP, Ekpo CM. Trend Analysis of the Variations of Ambient Temperature Using Mann-Kendall Test and Sen’s Estimate in Calabar. Southern Nigeria

[15] 15. Edet CO, Eno EE, Ettah EB, Kamgba FA. Seasonal Variation of Radio Refractivity in Calabar, Nigeria. measurements. 2017 Jun;6(6)

[16] 16. Edet CO, Eno EE, Ettah EB. Monthly Variation of Radio Refractivity in Calabar, Nigeria. International Journal of Scientific Research Engineering & Technology. IJSRET: ISSN; 2017. pp. 2278-0882

[17] 17. Edet CO, Eno EE, Ettah EB. Effects of Variations in Meteorological Parameters of Atmospheric Pressure. Relative Humidity and Temperature on Radio Refractivity in Calabar

[18] 18. Kamgba FA, Edet CO, Njok AO. Effects of some meteorological parameters on wind energy potential in Calabar, Nigeria. Asian journal of physical and chemical. sciences. 2017 Nov 16:1-7

[19] 19. Hussain M. Mahmud I. pyMannKendall: a python package for non-parametric Mann Kendall family of trend tests. Journal of Open Source Software. 2019;4(39):1556. DOI: 10.21105/joss.01556

[20] 20. Sen PK. Estimates of the regression coefficient based on Kendall's tau. Journal of the American statistical association. 1968 Dec 1;63(324):1379-1389

[21] 21. Zwiers FW, Storch HV. On the role of statistics in climate research. International Journal of Climatology. 2004;24:665-680. DOI: 10.1002/joc.1027

[22] 22. Agbo EP, Ekpo CM, Edet CO. Trend Analysis of Meteorological Parameters, Tropospheric Refractivity, Equivalent Potential Temperature for a Pseudoadiabatic Process and Field Strength Variability. Using Mann Kendall Trend Test and Sens Estimate. arXiv preprint arXiv. 2020 Oct 9;2010:04575

[23] 23. Agbo EP, Ekpo CM. Trend Analysis of the Variations of Ambient Temperature Using Mann-Kendall Test and Sen’s Estimate in Calabar, Southern Nigeria. InJournal of Physics: Conference Series 2021 Jan (Vol. 1734, No. 1, p. 012016). IOP Publishing. . DOI: 10.1088/1742-6596/1734/1/012016

The Role of Statistical Methods and Tools for Weather Forecasting and Modeling

Weather Forecasting

Abstract

Keywords

Author Information

Emmanuel P. Agbo*

1. Introduction

2. Review of statistical tests/methodology

2.1 Statistical test for trend (Mann-Kendall trend test)

Table 1.

2.2 Regression analysis

2.2.1 Simple linear regression

Figure 1.

2.2.2 Multiple linear regression

2.3 Review of the application of simple linear regression analysis in climatology (the Angstrom-Prescott model)

Figure 2.

Table 2.

2.4 Calculus in climatology

Figure 3.

2.5 Python implementation for Mann-Kendall trend test

3. Results and discussion

3.1 Results

3.2 Discussion

Figure 4.

Figure 5.

3.2.1 Relationship between refractivity and meteorological parameters

Figure 6.

3.2.2 Application of multiple linear regression in climatology

Table 3.

Table 4.

Figure 7.

Table 5.

4. Conclusion

Acknowledgments

Conflict of interest

A.1 Appendix A

A.2 Appendix B

A.3 Appendix C

A.4 Appendix D

A.5 Appendix E

A.6 Appendix F

A.7 Appendix G

References

Continue reading from the same book

Weather Forecasting