Reliable forecasts are essential for managing flows in a supply chain. Forecasts must be produced and integrated into the flow control models, particularly in contexts where demand is highly variable. However, forecasts are never perfectly reliable, so their quality must be assessed by measuring the forecast uncertainty attached to each estimate. Various forecasting models have been developed in the past, particularly in statistics. Before turning to our applications to real industrial cases, which comprise a prospective study of demand forecasting and a comparative study of selling price forecasts, we begin, in the first section of this chapter, by presenting the forecasting models, together with their validation and monitoring.
- quality of forecasts
- demand forecasting
- selling price forecasting
For most companies, forecasting is a prerequisite for effective supply chain management. As explained by Lai et al. , forecasting is the basis of all production management systems. The entire supply chain is based on the data from forecast models.
In Ref. , the authors show the usefulness of forecasting and planning as a decision-making tool for organizing the supply chain across all time horizons and at all levels.
Forecasting also occupies an important place in academic research. Given its essential role, it is understandable that many models have been developed since the beginning of the twentieth century. Research developed mainly from the 1950s onward, with the use of mathematical models. A review of the literature was carried out by Stadtler. It highlights the value of forecasting for the global supply chain, where it helps integrate the different organizations and coordinate their flows in order to satisfy the end consumer.
The sources used to build these forecasts are multiple and are located throughout the supply chain, including the commercial side of the business. Analyzing these sources helps build the basis for future forecasts.
2. Application to prospective approach: modeling and forecasting demand using the ARIMA models
In the manufacturing sector, demand forecasting is one of the most crucial problems in inventory management; it supports various operational planning activities during the production process, such as capacity planning and the management of used product acquisitions.
For both push and pull supply chain (SC) processes, demand forecasting forms the basis of all SC planning. Pull processes in the SC are performed in response to a customer order, while push processes are performed in anticipation of customer orders. A business needs to take into account many factors related to demand forecasting. Some of these factors are listed below:
product delivery time;
planned advertising or marketing efforts;
state of the economy;
price reduction planned; and
actions undertaken by competitors.
Businesses need to understand these factors before choosing a forecasting method, since it can be difficult to decide which method is the most suitable. Forecasting methods are classified into the following types: time series, causal, qualitative, and simulation.
A time series is a set of observations recorded in chronological order. Time series forecasting models rely on historical data to forecast demand. These mathematical models are based on the assumption that the future is an extension of the past.
Numerous studies on demand forecasting by time series analysis have been carried out in several fields. They include demand forecasts for food sales , tourism , spare parts [4, 11], electricity [12, 13], automobiles , and some other goods and services [15, 16, 17].
In this section, we forecast the demand for a product in a food manufacturing operation based on real data, and we examine the precision and characteristics of these forecasts.
Our study will be carried out according to the three stages of the Box-Jenkins approach: identification, estimation, and verification. We present the model relating to product demand from January 2010 to December 2015 as shown in Figure 1.
2.1 Identification of model
This stage covers the initial preprocessing of the data to make it stationary and the choice of the p and q values, which can be adjusted during model fitting.
We present the ACF and PACF diagrams of the series in Figures 2 and 3, respectively. The series oscillates around an average value, and its autocorrelation function decreases rapidly to zero, which suggests that the time series studied is stationary.
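As a rough illustration of how the values plotted in a correlogram are obtained, the following Python sketch computes the sample autocorrelation function of a series and flags the lags that fall outside the approximate 95% band of ±1.96/√n. The `demand` values are hypothetical and are not the data of this study.

```python
def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_0 .. r_max_lag of a list of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    acf = []
    for k in range(max_lag + 1):
        num = sum(dev[t] * dev[t + k] for t in range(n - k))
        acf.append(num / denom)
    return acf


# Hypothetical monthly demand values, for illustration only.
demand = [120, 135, 128, 140, 132, 125, 138, 130, 127, 136, 133, 129]
r = sample_acf(demand, max_lag=4)
bound = 1.96 / len(demand) ** 0.5  # approximate 95% band under white noise

for lag, value in enumerate(r):
    flag = "significant" if lag > 0 and abs(value) > bound else ""
    print(f"lag {lag}: {value:+.3f} {flag}")
```

A series whose autocorrelations fall quickly inside the band behaves like the rapidly decaying ACF described above.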
Moreover, to assess whether the data come from a stationary process, we can perform the unit root test: Dickey-Fuller test for stationarity. After carrying out the test on the Xlstat software, the results are grouped in Table 1.
H0: The series has a unit root.
H1: The series does not have a unit root. The series is stationary.
|Tau (observed value)||−1.350|
|Tau (critical value)||−0.717|
Since the calculated p value is greater than the significance level α = 0.05, the null hypothesis H0 cannot be rejected. The risk of rejecting the null hypothesis H0 while it is true is 84.38%.
In our study, we checked the stationarity of the series, and we noted from the ACF and PACF correlograms that our model cannot be a pure AR or a pure MA process. Therefore, we tested several models to identify the most suitable one for our series.
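To make the tau statistic more concrete, here is a minimal Python sketch of the simplest Dickey-Fuller regression, Δy_t = ρ·y_{t−1} + e_t, with no constant, no trend, and no lagged differences. This specification is an assumption made for illustration; the software used in this study may fit a different variant, and the critical values and p value must in any case be taken from Dickey-Fuller tables rather than from the Student distribution. The `series` values are hypothetical.

```python
import math


def dickey_fuller_tau(series):
    """Tau statistic of the regression dy_t = rho * y_{t-1} + e_t.

    Simplest Dickey-Fuller form: no constant, no trend, no lagged differences.
    """
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    residuals = [d - rho * x for x, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in residuals) / (len(diffs) - 1)
    return rho / math.sqrt(sigma2 / sxx)


# Hypothetical, already centered series (illustration only).
series = [3, -2, 1, -4, 2, 0, -3, 4, -1, 2, -2, 1]
tau = dickey_fuller_tau(series)
print(f"tau = {tau:.3f}")
```

The more negative the observed tau, the stronger the evidence against a unit root.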
2.2 Estimation of model coefficients
The best model is as simple as possible and minimizes certain criteria, namely the AIC (Akaike information criterion), the SBC (Schwarz Bayesian criterion), and the variance, while maximizing the likelihood [23, 24, 25]. The chosen model is ARIMA (1, 0, 1) with a constant. For the other models, either the Student test values (T-RATIO) fall within the range of ±1.96, meaning that a coefficient is not significantly different from zero, or one of the minimization criteria is higher than that found for the ARIMA (1, 0, 1) model with a constant.
Table 2 presents the values obtained for the different models. From this table, we choose the model on which we will base our forecasts.
|ARIMA (1,0,2)||ARIMA (2,0,2)||ARIMA (1,0,1)||ARIMA (1,0,0)||ARIMA (0,0,1)||ARIMA (1,0,1) without constant|
It is clear from Table 2 that the ARIMA (1,0,1) model is selected because all its coefficients are significantly different from 0 according to the Student test (|T-RATIO| ≥ 1.96), with an acceptable goodness of fit.
The model residuals are stationary and follow a white noise process within the range of ±40. The histogram of the residuals shows whether their distribution approximates a normal distribution. In our case, the residuals are distributed approximately normally around zero, with a relatively low dispersion, at a 5% risk.
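The selection rule described above can be sketched in Python as follows: keep only the candidates whose coefficients all satisfy |T-RATIO| ≥ 1.96, then choose the one with the lowest information criterion. The log-likelihoods and t-ratios below are hypothetical placeholders, not the values of Table 2, and the number of observations assumes monthly data from January 2010 to December 2015.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion."""
    return 2 * n_params - 2 * log_likelihood


def sbc(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian criterion (also called BIC)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def all_significant(t_ratios, threshold=1.96):
    """True when every coefficient passes the Student test at the 5% level."""
    return all(abs(t) >= threshold for t in t_ratios)


N_OBS = 72  # assumption: monthly observations over six years

# Hypothetical fits: name -> (log-likelihood, t-ratios of the coefficients).
candidates = {
    "ARIMA(1,0,2)": (-350.2, [2.4, 1.1, 0.7, 5.0]),
    "ARIMA(1,0,1)": (-350.9, [3.1, -2.6, 6.2]),
    "ARIMA(1,0,0)": (-356.4, [2.2, 5.8]),
}

admissible = {name: fit for name, fit in candidates.items() if all_significant(fit[1])}
best = min(admissible, key=lambda name: aic(admissible[name][0], len(admissible[name][1])))

for name, (loglik, t_ratios) in admissible.items():
    k = len(t_ratios)
    print(f"{name}: AIC={aic(loglik, k):.1f}  SBC={sbc(loglik, k, N_OBS):.1f}")
print("selected:", best)
```

Filtering on significance first and comparing criteria second mirrors the order of the reasoning in the text; other orderings are possible.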
The chosen model parameters are presented in Table 3.
The developed model is given by Eq. 1.
yt, yt–1: sales of periods t and t–1, respectively.
εt, εt–1: residuals of periods t and t–1, which constitute a white noise.
α1, θ1: coefficients of autoregressive and moving average processes, respectively.
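For illustration, an ARIMA (1, 0, 1) model with a constant can be written in general form as y_t = c + α1·y_{t−1} + ε_t − θ1·ε_{t−1}. The minus sign in front of θ1 follows the Box-Jenkins convention and is an assumption about the convention used here; the symbols y, ε, and c are likewise chosen for this sketch. The following Python function applies this recursion to produce forecasts; the coefficient values are hypothetical and are not those of Table 3.

```python
def arma11_forecast(history, c, alpha1, theta1, steps):
    """Forecast `steps` periods ahead with an ARMA(1,1) model with constant.

    Assumed form: y_t = c + alpha1 * y_{t-1} + e_t - theta1 * e_{t-1}.
    The residual before the first observation is taken as zero.
    """
    residual = 0.0
    # Rebuild the one-step-ahead residuals over the observed history.
    for prev, current in zip(history[:-1], history[1:]):
        fitted = c + alpha1 * prev - theta1 * residual
        residual = current - fitted
    forecasts = []
    last = history[-1]
    for _ in range(steps):
        nxt = c + alpha1 * last - theta1 * residual
        forecasts.append(nxt)
        # Future shocks are unknown; their expected value is zero.
        last, residual = nxt, 0.0
    return forecasts


# Hypothetical sales history and coefficients (illustration only).
sales = [510.0, 495.0, 530.0, 520.0, 505.0]
print(arma11_forecast(sales, c=250.0, alpha1=0.5, theta1=0.3, steps=3))
```

With |α1| < 1, the forecasts converge toward the long-run mean c / (1 − α1) as the horizon grows.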
2.3 Accuracy of ARIMA (1, 0, 1) model
In order to assess the accuracy of the developed model, we compare the observed and simulated sales over the same period. This comparison is drawn up in Table 4 and shows that the selected model is precise and able to reproduce the dynamic behavior of sales. Therefore, this model can be used to analyze and model the demand in this food manufacturing operation.
Figure 4 supports the validity of the model: the observed demand fluctuates around the fitted values, and the forecast demand remains between the upper limit and the lower limit.
The error varies, but it remains within the tolerance range. To reduce this error further, we will consider other approaches in our future work.
Once the appropriate model has been defined and validated, we produce the forecasts using IBM SPSS Forecasting. Table 4 and Figure 5 present the sales forecasts obtained by applying our ARIMA (1, 0, 1) model to the next 10 months, from January 2016 to October 2016.
The chosen model can therefore be used to model and forecast future demand in this food manufacturing operation. However, the historical data must be updated with each new observation so that the model can be re-estimated and the forecasts improved.
The accurate forecasts obtained facilitated production decisions in this business. Indeed, the model allowed us to forecast demand precisely. Once a demand forecast is available, it becomes much easier to plan production and thus avoid heavy cost losses.
3. Application to comparative approach: comparison of the quality of forecasts obtained in the context of forecasting selling prices
Our second industrial application is devoted to a modeling study and comparative forecast of sales prices using ARIMA models, artificial neural networks, and support vector machines.
In this section, we model the actual price data of a fuel named “SSP” in order to predict its future selling prices. The series shown in Figure 6 is the price of “SSP” fuel at a petroleum production company from January 2012 to December 2016.
3.1 Forecasting using ARIMA models
3.1.1 Determination of the differencing parameter
The autocorrelation function of the series is positive over a large number of lags, so the series must be differenced.
The next step is to difference the series. It must be differenced enough to make it stationary, but not excessively, since over-differencing causes a loss of information and therefore unstable models. In our case, d = 1 was sufficient because the trend is linear.
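The effect of differencing can be seen on a small example. The Python sketch below, using hypothetical values, shows that one difference (d = 1) turns a linear trend into a constant series, while a quadratic trend would need d = 2.

```python
def difference(series, d=1):
    """Apply first differencing d times: y_t - y_{t-1}."""
    for _ in range(d):
        series = [b - a for a, b in zip(series[:-1], series[1:])]
    return series


linear_trend = [10, 12, 14, 16, 18]      # hypothetical values
quadratic_trend = [1, 4, 9, 16, 25]      # hypothetical values

print(difference(linear_trend, d=1))      # constant after one difference
print(difference(quadratic_trend, d=1))   # still trending after one difference
print(difference(quadratic_trend, d=2))   # constant after two differences
```

Each difference shortens the series by one observation, which is one reason to avoid differencing more than necessary.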
Besides, to decide whether the data come from a stationary process, we can carry out a unit root test: the Dickey-Fuller test for stationarity. After performing the test with the Xlstat software, we grouped the results in Table 5.
H0: The series has a unit root.
H1: The series does not have a unit root. The series is stationary.
|Tau (observed value)||−4.0325|
|Tau (critical value)||−0.7648|
Since the calculated p value is less than the significance level α = 0.05, the null hypothesis H0 must be rejected in favor of the alternative hypothesis H1. The risk of rejecting the null hypothesis H0 while it is true is less than 0.92%.
We conclude that our model has an order of differencing d = 1. We also note that the T-RATIO for the constant μ of the model is less than 2 in absolute value. We must therefore remove the constant from the model before determining the parameters p and q.
3.1.2 Determination of the autoregressive parameter
Figures 9 and 10 clearly show that the partial autocorrelation has a significant peak at lag 2, from which we deduce that the differenced series has an autoregressive signature. The parameter p is therefore set to 1.
However, the T-RATIO for the autoregressive parameter φ1 is less than 2 in absolute value, so we cannot retain this model. Similarly, the ARIMA (2, 1, 0) model has autoregressive parameters whose T-RATIO is less than 2 in absolute value.
3.1.3 Determination of the moving average parameter
Here again, the T-RATIO for the moving average parameter θ1 is less than 2 in absolute value, so we cannot retain this model. Similarly, the ARIMA (0,1,2) model has moving average parameters whose T-RATIO is less than 2 in absolute value.
3.1.4 Mixed ARIMA model
After several iterations and tests, we concluded that only the ARIMA (1,1,1) model had T-RATIOs greater than 2 in absolute value. This is the model we use to make forecasts.
With the coefficients obtained now, we can write the equation of the model retained as follows:
Table 6 lists the forecasts obtained for the first quarter of 2017.
|Fortnight||Real price||Model||% error|
The graph in Figure 11 supports the adequacy of the ARIMA (1,1,1) model developed, whose output is very close to the real prices.
Table 6 allows us to conclude that the chosen model can be used to model and forecast future selling prices at this petroleum production company.
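The “% error” column of such a comparison table and its average can be computed as in the following Python sketch. The prices are hypothetical and are not the values of the table; the 3% tolerance is the margin of error set by the company, as stated later in this chapter.

```python
def percent_errors(actual, predicted):
    """Absolute percentage error of each forecast relative to the real value."""
    return [abs(a - p) / abs(a) * 100 for a, p in zip(actual, predicted)]


def mean_percent_error(actual, predicted):
    errors = percent_errors(actual, predicted)
    return sum(errors) / len(errors)


# Hypothetical fortnightly prices (illustration only).
real_price = [1050.0, 1062.0, 1071.0, 1058.0, 1049.0, 1066.0]
model_price = [1031.0, 1085.0, 1040.0, 1079.0, 1072.0, 1045.0]

TOLERANCE = 3.0  # margin of error tolerated by the company, in percent

for real, model, err in zip(real_price, model_price, percent_errors(real_price, model_price)):
    print(f"{real:8.1f} {model:8.1f} {err:5.2f}%")

average = mean_percent_error(real_price, model_price)
print(f"average error: {average:.3f}%  within tolerance: {average <= TOLERANCE}")
```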
3.1.5 Forecasting using artificial neural networks
The goal here is to establish a relationship from experimental data collected from authentic sources in order to estimate the selling price of fuel. We apply radial basis function (RBF) neural networks, a machine learning approach, because of the complex relationship between the input parameter and the output parameter. In this section, we present the modeling approach based on this technique so that it can be compared with the ARIMA model used in the previous section.
3.1.6 Model development
The radial basis ANN model (comprising two layers) is trained with the back propagation algorithm to minimize the mean squared error, with one parameter (time) as the input and the fuel selling price as the desired output. As shown in the visualization of the network in Figure 12, the first layer has radial basis transfer functions with a maximum of 80 neurons, and the second layer has a linear transfer function, in order to build a consistent model that provides accurate forecasts.
Feature selection is one of the core concepts in machine learning and strongly affects model performance: irrelevant or partially relevant features can degrade it. Feature selection and data cleaning should therefore be the first step in designing a model. However, in our case, this step may be omitted as long as our point cloud is significant. Subsequently, the dataset was randomly divided into two disjoint subsets: a training set (60% of the total dataset), used to fit the model, and a testing set (40% of the total dataset), used to validate the model found. The training set is used to develop the network. After the training phase, the reliability and accuracy of the network were assessed with the test data. In our study, we implemented the radial basis network with the MATLAB toolbox function “newrb”. The Gaussian function is the main kernel function used here, with a width parameter of 1.
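The two ingredients described above, a random split into disjoint training and testing sets and a Gaussian radial basis layer followed by a linear layer, can be sketched in Python as follows. This is not the MATLAB implementation used in the study: the Gaussian parameterization exp(−(d/width)²) is one common choice taken here as an assumption, the function names are hypothetical, and the data are synthetic.

```python
import math
import random


def split_dataset(samples, train_fraction, seed=0):
    """Randomly split samples into disjoint training and testing subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def gaussian_rbf(x, center, width=1.0):
    """Gaussian radial basis activation for a scalar input."""
    return math.exp(-((x - center) / width) ** 2)


def rbf_network(x, centers, weights, bias, width=1.0):
    """Two-layer network: Gaussian hidden layer, then a linear output layer."""
    hidden = [gaussian_rbf(x, c, width) for c in centers]
    return bias + sum(w * h for w, h in zip(weights, hidden))


# Synthetic (time, price) pairs, for illustration only.
samples = [(t, 1000.0 + 5.0 * t) for t in range(1, 101)]
train, test = split_dataset(samples, train_fraction=0.6)
print(len(train), len(test))

# Hypothetical centers and weights; in practice they are learned from `train`.
print(rbf_network(2.0, centers=[1.0, 2.0, 3.0], weights=[0.5, 1.0, 0.5], bias=1000.0))
```

Fixing the random seed makes the split reproducible, which helps when comparing the 60/40 and 80/20 configurations.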
After executing the learning phase, we obtain Figures 13 and 14 that represent the learning of our database. Figure 15 represents the error in the training phase. During the test phase, we gave values to the input variable to visualize the results of the output and thus simulate our model.
3.1.7 Error optimization
Optimizing the error consists of finding a compromise between the various parameters of the network, namely the spread, the goal, the maximum number of neurons (MN), and the number of neurons to be added to the hidden layer (DF). This compromise is based on tests of several combinations of these parameters. Some of these combinations are presented in Table 7.
After testing different combinations, we find that the error is considerable for all of them. Consequently, no model fits the time series, especially in the long term. The reason is not only the large fluctuations in the selling price of the fuel but also the share of the total dataset used in the training stage (60%), which is too small to predict the remaining 40% of the dataset. The training share must therefore be increased. In the next step, we use 80% of the total dataset for the training phase and 20% for testing the model. Table 8 summarizes the different combinations.
|Parameters||Relative error (%)|
The combination that minimizes the error is therefore:
goal = 0.01;
spread = 1;
MN = 20; and
DF = 30.
We can conclude that training with 80% of the database gives better results than training with 60%, since the error is reduced. The output is calculated and presented in Table 9.
|Input (time)||Real value of output||Predicted value of output||% error|
From Table 9, we can clearly see that the selected model can be used to model and forecast future selling prices at this petroleum company. As a last part, we apply the methodology of support vector machines to see what results it gives.
3.1.8 Forecasting using support vector machines (SVMs)
The aim of our current work is to develop a relationship between experimental data collected from authentic sources to estimate the selling price of fuel. We are trying to apply support vector machines based on machine learning approaches because of the complex relationships between the input parameter and the output.
We prepared our database and then developed the program in Python, which was run in the Spyder environment.
We imported our dataset, which contains the actual prices of the fuel studied, and created and indexed the locations of the values in the database. Then, we standardized the data to suit the learning process carried out with the SVR function. We divided our database into a learning part and a test part, and tried two main distributions: (1) 60% of the database for the learning phase and 40% for the testing phase and (2) 80% for the learning phase and 20% for the testing phase. We kept the second distribution based on the results obtained after running the program. After that, we fitted the model on “Train X” and “Train Y” and ran the test to calculate the average of the errors and obtain the values predicted in the test phase, which are grouped in Figure 16.
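The data preparation steps described above can be sketched with the Python standard library alone. The sketch below assumes a split by position (the first 80% for learning, the rest for testing) and a z-score standardization whose mean and standard deviation are computed on the learning part only; neither detail is stated in the text. The helper names and price values are hypothetical, and the regression fit itself (the SVR function) is left out.

```python
import statistics


def split_by_position(values, train_fraction):
    """Keep the first part for learning and the rest for testing."""
    cut = round(len(values) * train_fraction)
    return values[:cut], values[cut:]


def fit_scaler(train_values):
    """Mean and standard deviation estimated on the learning part only."""
    return statistics.fmean(train_values), statistics.pstdev(train_values)


def standardize(values, mean, std):
    return [(v - mean) / std for v in values]


def destandardize(z_values, mean, std):
    """Map standardized predictions back to the original price scale."""
    return [z * std + mean for z in z_values]


# Hypothetical prices (illustration only).
prices = [1040.0, 1052.0, 1061.0, 1049.0, 1070.0, 1082.0, 1066.0, 1075.0, 1090.0, 1084.0]

train_y, test_y = split_by_position(prices, train_fraction=0.8)
mean, std = fit_scaler(train_y)
train_z = standardize(train_y, mean, std)
test_z = standardize(test_y, mean, std)

print(len(train_y), len(test_y))
print([round(z, 3) for z in test_z])
```

Predictions produced on the standardized scale would be passed through `destandardize` before computing errors in price units.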
The average error is equal to 26.882361, which represents 2.53%. The error graph is shown in Figure 17.
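The text does not state how the average error is converted into a percentage. One plausible reading, taken here purely as an assumption, is that the mean absolute error is divided by the mean real price. The Python sketch below illustrates that calculation on hypothetical values.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_error_percent(actual, predicted):
    """Mean absolute error expressed as a percentage of the mean real value."""
    mean_actual = sum(actual) / len(actual)
    return mean_absolute_error(actual, predicted) / mean_actual * 100


# Hypothetical test-phase values (illustration only).
real = [1000.0, 1100.0]
predicted = [980.0, 1130.0]

print(mean_absolute_error(real, predicted))
print(round(relative_error_percent(real, predicted), 2))
```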
The chosen model can be used to model and forecast future selling prices for this petroleum company since the observed error (2.53%) is within the allowable margin of error set by the company at 3%. In addition, the SVR function is a useful tool that provides good precision and reduces the error compared with the ARIMA model.
In the first industrial application of this chapter, we modeled demand using ARIMA models. The model we have obtained will allow the company to forecast demand and make precise forecasts.
In the second application, we studied the selling prices of the SSP via three methodologies: ARIMA, RBF, and SVMs.
First, we developed an ARIMA model based on historical data. This study allowed us to determine the ARIMA (1,1,1) model, which gives gasoline price forecasts close to the target margin for the first quarter of the current year, with an average margin of error of 2.855%. Second, we used the RBF technique to improve the modeling and forecasting of the selling price of fuel. This technique reduced the error further: 1.95% instead of 2.85% for the ARIMA model. Finally, we used the SVM function. The forecasts made are quite satisfactory because they respect the margin tolerated by the company. The error of the SVM function is around 2.53%.
In summary, the SVM function reduced the error to 2.53%, compared with 2.855% for the ARIMA model, but this error remains higher than that obtained with the RBF technique.
For most companies, forecasting is a prerequisite for effective supply chain management. Forecasting is the basis of all production management systems. The entire supply chain is based on data from forecast models.
In this chapter, we have presented a study of demand and selling price forecasting in industrial companies. We also carried out a comparative study aimed at minimizing the error in order to obtain more accurate forecasts.
In the first part, we modeled the future demand for a food company using ARIMA models based on the Box-Jenkins methodology. The model obtained will allow the company to forecast demand precisely. The chosen model can be used to model and forecast future demand for this agribusiness, provided that the historical data are updated with each new observation.
Second, we carried out a study, which consists in comparing the quality of the forecasts obtained in the context of forecasting selling prices. We presented the application of three different methodologies allowing us to make sales forecasts in a company operating in the petroleum sector.
We developed an ARIMA model based on historical data. This study allowed us to determine the optimal autoregressive, moving average, and differencing parameters in order to make predictions. We found that the ARIMA (1,1,1) model gives gasoline price forecasts close to the target margin for the first quarter of the current year, with an average margin of error of 2.855%, which is within the margin of error tolerated by the company (plus or minus 3%). In addition, the hypothesis that the residuals are Gaussian white noise was verified in every case.
Then, we forecast selling prices with the RBF technique in order to improve the modeling and forecasting done before. To do this, we developed an RBF network based on historical data and compared its forecast performance with that of the ARIMA model. This technique proved effective and allowed us to reduce the error to 1.95%, versus 2.85% for the ARIMA model.
Finally, we studied the SSP selling prices with the SVM function. We prepared our database and then developed the program in Python, which was run in the Spyder environment. The forecasts made are quite satisfactory with regard to the constraint imposed by the company (plus or minus 3% margin of error). The error of the SVM function is around 2.53%. The SVM function therefore reduced the error to 2.53%, compared with 2.855% for the ARIMA model, but this error remains higher than that obtained with neural networks.