Open access peer-reviewed chapter - ONLINE FIRST

Investigation of Fuzzy Inductive Modeling Method in Forecasting Problems

By Yu. Zaychenko and Helen Zaychenko

Submitted: November 5th 2018. Reviewed: April 15th 2019. Published: May 27th 2019.

DOI: 10.5772/intechopen.86348

Abstract

This paper is devoted to the investigation and application of a fuzzy inductive modeling method, the fuzzy group method of data handling (GMDH), to forecasting problems in the financial sphere. GMDH belongs to the self-organizing methods and allows hidden internal laws of the subject area to be discovered. The advantage of GMDH algorithms is the possibility of constructing optimal models. A generalization of GMDH to the case of uncertainty, fuzzy GMDH (FGMDH), is described, which makes it possible to construct fuzzy models almost automatically. The algorithms of fuzzy GMDH for different membership functions are considered. Extensions of fuzzy GMDH to different partial descriptions, namely orthogonal Chebyshev polynomials and trigonometric (Fourier) polynomials, are considered. The problem of adapting fuzzy models obtained by FGMDH is considered, and the corresponding adaptation algorithm is described. Experimental investigations of the suggested FGMDH in forecasting macroeconomic indicators of Ukraine are carried out, and a comparison with classic GMDH and a backpropagation (BP) neural network is performed.

Keywords

  • fuzzy GMDH
  • orthogonal partial descriptions
  • model adaptation
  • forecasting

1. Introduction

One of the most important problems in the sphere of economy and finance is the problem of forecasting economic and financial processes. The distinguishing properties of these processes are the following:

  1. The form of the functional dependence is unknown, and only the model class is determined.

  2. Short data samples.

  3. The time series $x_i(t)$ is, in the general case, nonstationary.

In this case the application of traditional methods of statistical analysis (e.g., regression analysis) is impossible, and it’s necessary to apply methods based on computational intelligence (CI). The group method of data handling (GMDH), developed by Ivakhnenko [1, 2] and extended by his colleagues, belongs to this class. GMDH is a self-organizing method that allows hidden laws of the subject area to be discovered. The advantage of GMDH algorithms is the capability of constructing optimal models.

But classic GMDH has the following shortcomings:

  1. GMDH uses the least squares method (LSM) to find the model coefficients, but the matrix of the linear equations may be close to degenerate, and the corresponding solution may be unstable. Therefore, special regularization methods should be used.

  2. GMDH doesn’t work in the case of qualitative or fuzzy input data.

Therefore, in the last 10 years, a new variant of GMDH, fuzzy GMDH, was developed and extended; it can work with fuzzy input data and is free of the drawbacks of classical GMDH [3, 4, 5].

Fuzzy GMDH is based on the same principles as classical GMDH but constructs fuzzy models.

The main goals of this paper are to investigate different modifications of FGMDH, analyze their properties, and investigate their efficiency as compared with classical GMDH in forecasting problems.

2. Problem formulation

A set of initial data is given, consisting of input variables $\{X(1), X(2), \dots, X(N)\}$ and output variables $\{Y(1), Y(2), \dots, Y(N)\}$, where $X = (x_1, x_2, \dots, x_n)$ is an $n$-dimensional vector and $N$ is the number of observations. The input data may be incomplete or fuzzy, in particular given in interval form. The task is to construct an adequate fuzzy forecasting model $Y = F(x_1, x_2, \dots, x_n)$; in addition, the obtained model should have minimal complexity.

2.1 Principal ideas of GMDH: fuzzy model construction

As is well known, the drawbacks of GMDH are the following [3, 4]:

  • GMDH uses LSM to find the model coefficients, but the matrix of the linear equations may be close to degenerate, and the corresponding solution may be unstable and very volatile. Therefore, special regularization methods should be applied.

  • GMDH yields point-wise estimates, but in many cases it’s desirable to find interval estimates of the coefficients.

  • GMDH doesn’t work in the case of incomplete, qualitative, or fuzzy input data.

Therefore, in the last 10 years, a new variant of GMDH, fuzzy GMDH, was developed and improved; it can work with fuzzy and qualitative input data and is free of the drawbacks of classical GMDH [3].

As it is well known, GMDH method is based on the following principles [1, 2, 3]:

  1. The principle of multiplicity of models

  2. The principle of external complement which means that the whole sample should be divided into two parts—training subsample and test subsample

  3. The principle of self-organization

  4. The principle of freedom of choice

Fuzzy GMDH is also based on these principles but constructs fuzzy models. Let’s consider its main ideas.

In works [3, 4, 5], the linear interval regression model was considered:

$$Y = A_0 Z_0 + A_1 Z_1 + \dots + A_n Z_n \tag{1}$$

where $A_i$ is a fuzzy number of triangular form described by a pair of parameters $A_i = (\alpha_i, c_i)$, in which $\alpha_i$ is the interval center and $c_i \ge 0$ is its width, and $Z_i$ are the input variables.

Then Y is a fuzzy number, parameters of which are determined as follows:

The interval center

$$\alpha_y = \sum_{i} \alpha_i z_i = \alpha^T z \tag{2}$$

The interval width

$$c_y = \sum_{i} c_i |z_i| = c^T |z| \tag{3}$$

For example, for the partial description (PD) of the kind

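$$f(x_i, x_j) = A_0 + A_1 x_i + A_2 x_j + A_3 x_i x_j + A_4 x_i^2 + A_5 x_j^2 \tag{4}$$

it’s necessary to substitute in the general model (1)

$$z_0 = 1, \quad z_1 = x_i, \quad z_2 = x_j, \quad z_3 = x_i x_j, \quad z_4 = x_i^2, \quad z_5 = x_j^2.$$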
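As a rough illustration of (2)-(4), the following Python sketch computes the center and width of the fuzzy output for one pair of inputs. The function names and the coefficient values are hypothetical, and the width is taken as $c^T |z|$, i.e., it assumes absolute values of the regressors so that the width stays non-negative.

```python
def pd_features(xi, xj):
    """Regressors z0..z5 of the quadratic partial description (4)."""
    return [1.0, xi, xj, xi * xj, xi ** 2, xj ** 2]


def interval_output(alpha, c, z):
    """Center and width of the fuzzy output Y, following (2) and (3)."""
    center = sum(a * zi for a, zi in zip(alpha, z))
    width = sum(ci * abs(zi) for ci, zi in zip(c, z))  # assumes c_i >= 0
    return center, width


# Hypothetical coefficients, chosen only for illustration
alpha = [1.0, 0.5, -0.2, 0.0, 0.1, 0.0]   # interval centers
c = [0.1, 0.05, 0.0, 0.0, 0.0, 0.0]       # interval widths

z = pd_features(2.0, -1.0)
center, width = interval_output(alpha, c, z)
lower, upper = center - width, center + width
print(center, width, lower, upper)
```

With these illustrative values the estimated interval is centered at 2.6 with width 0.2.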

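Let the training sample be $\{z_1, z_2, \dots, z_M\}$, $\{y_1, y_2, \dots, y_M\}$. Then for the model (1) to be adequate, it’s necessary to find parameters $(\alpha_i, c_i)$, $i = 1, \dots, n$, which satisfy the following inequalities:

$$\alpha^T z_k - c^T |z_k| \le y_k, \qquad \alpha^T z_k + c^T |z_k| \ge y_k, \qquad k = 1, \dots, M \tag{5}$$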

Let’s formulate the basic requirements for a linear interval model of the kind (4).

It’s necessary to find values of the parameters $(\alpha_i, c_i)$ of the fuzzy coefficients for which:

  1. The observed output values $y_k$ fall within the estimated interval for $Y_k$.

  2. The total width of the estimated intervals over all sample points is minimal.

These requirements lead to the following linear programming (LP) problem [3, 4]:

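$$\min \left( C_0 M + C_1 \sum_{k=1}^{M} |x_{ki}| + C_2 \sum_{k=1}^{M} |x_{kj}| + C_3 \sum_{k=1}^{M} |x_{ki} x_{kj}| + C_4 \sum_{k=1}^{M} x_{ki}^2 + C_5 \sum_{k=1}^{M} x_{kj}^2 \right) \tag{6}$$

under constraints

$$a_0 + a_1 x_{ki} + a_2 x_{kj} + a_3 x_{ki} x_{kj} + a_4 x_{ki}^2 + a_5 x_{kj}^2 - \left( C_0 + C_1 |x_{ki}| + C_2 |x_{kj}| + C_3 |x_{ki} x_{kj}| + C_4 x_{ki}^2 + C_5 x_{kj}^2 \right) \le y_k \tag{7}$$
$$a_0 + a_1 x_{ki} + a_2 x_{kj} + a_3 x_{ki} x_{kj} + a_4 x_{ki}^2 + a_5 x_{kj}^2 + \left( C_0 + C_1 |x_{ki}| + C_2 |x_{kj}| + C_3 |x_{ki} x_{kj}| + C_4 x_{ki}^2 + C_5 x_{kj}^2 \right) \ge y_k, \quad C_p \ge 0, \; p = 0, \dots, 5, \; k = 1, \dots, M \tag{8}$$

where $k$ is the index of a sample point, and $a_p$ and $C_p$ denote the center $\alpha_p$ and the width $c_p$ of the fuzzy coefficient $A_p$.

As one can easily see, the task (6)-(8) is an LP problem. However, the model (6)-(8) is inconvenient for standard LP methods because there are no non-negativity constraints on the variables $\alpha_i$. Therefore, for its solution it’s reasonable to pass to the dual LP problem by introducing dual variables $\delta_k$ and $\delta_{k+M}$, $k = 1, \dots, M$. Once the optimal solution of the dual problem is found by the simplex method, the optimal solution $(\alpha_i, c_i)$ of the initial (primal) problem is also obtained.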
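To make the structure of the LP problem (6)-(8) concrete, the sketch below assembles its data in the generic form "minimize cost·x subject to A_ub·x ≤ b_ub" for the variable vector x = (a_0..a_5, C_0..C_5), and checks a candidate point for feasibility. It only builds and checks the problem and does not solve it; the function names, the sample values, and the use of absolute values in the width terms are assumptions made for illustration.

```python
def pd_features(xi, xj):
    """Regressors z0..z5 of the quadratic partial description (4)."""
    return [1.0, xi, xj, xi * xj, xi ** 2, xj ** 2]


def build_interval_lp(pairs, ys):
    """Return (cost, A_ub, b_ub) for x = [a_0..a_5, C_0..C_5]."""
    Z = [pd_features(xi, xj) for xi, xj in pairs]
    n = len(Z[0])
    # Objective (6): total interval width; the centers a_p carry zero cost.
    cost = [0.0] * n + [sum(abs(z[p]) for z in Z) for p in range(n)]
    A_ub, b_ub = [], []
    for z, y in zip(Z, ys):
        abs_z = [abs(v) for v in z]
        # (7):  a.z - C.|z| <= y
        A_ub.append(z + [-v for v in abs_z])
        b_ub.append(y)
        # (8):  a.z + C.|z| >= y, rewritten as  -a.z - C.|z| <= -y
        A_ub.append([-v for v in z] + [-v for v in abs_z])
        b_ub.append(-y)
    return cost, A_ub, b_ub


def is_feasible(x, A_ub, b_ub, n, tol=1e-9):
    """Check the inequality rows and the sign constraints C_p >= 0."""
    widths_ok = all(v >= -tol for v in x[n:])
    rows_ok = all(
        sum(a * v for a, v in zip(row, x)) <= b + tol
        for row, b in zip(A_ub, b_ub)
    )
    return widths_ok and rows_ok


# Hypothetical training points (x_ki, x_kj) and outputs y_k
pairs = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0)]
ys = [3.1, 2.9, 7.2]
cost, A_ub, b_ub = build_interval_lp(pairs, ys)

# A crude feasible point: zero centers and a constant width covering all y_k
x0 = [0.0] * 6 + [max(abs(y) for y in ys)] + [0.0] * 5
print(is_feasible(x0, A_ub, b_ub, n=6))
```

The resulting arrays could be handed to any LP solver that accepts free variables for $a_p$ and lower bounds of zero for $C_p$ (for example, `scipy.optimize.linprog` with per-variable `bounds`), which is one way to avoid forming the dual problem by hand.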

3. Description of fuzzy GMDH algorithm

Let’s present the brief description of the algorithm FGMDH [3, 4].

  1. Choose the general model type by which the sought dependence will be described.

  2. Choose the external criterion of optimality (criterion of regularity or non-biasedness).

  3. Choose the type of partial descriptions (e.g., linear or quadratic one).

  4. Divide the sample into training $N_{train}$ and test $N_{test}$ subsamples.

  5. Set the model counter $k$ and the iteration counter $r$ to zero.

  6. Generate a new partial model $f_k$ of the form (4) using the training sample. Solve the LP problem (6)-(8), and find the values of the parameters $\alpha_i$, $c_i$.

  7. Calculate the value of the external criterion ($N_{ub,k}(r)$ or $\delta_k^2(r)$) on the test sample.

  8. Set $k = k + 1$. If $k > C_N^2$ for $r = 1$ or $k > C_F^2$ for $r > 1$, then set $k = 1$, $r = r + 1$, and go to step 9; otherwise go to step 6.

  9. Calculate the best value of the criterion for the models of the $r$th iteration, $\delta^2(r)$ or $N_{ub}(r)$. If $r = 1$, then go to step 6; otherwise go to step 10.

  10. If $|N_{ub}(r) - N_{ub}(r-1)| \le \varepsilon$ or $\delta^2(r) \ge \delta^2(r-1)$, then go to step 11; otherwise select the $F$ best models, assign $r = r + 1$ and $k = 1$, go to step 6, and execute the $(r + 1)$th iteration.

  11. Select the best model of the previous iteration using the external criterion. Moving back along its connections and successively passing through all the previous rows, find the analytical form of the constructed model.
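The row-by-row selection in steps 5-11 can be sketched in Python as follows. This is only a skeleton under stated assumptions: the fitting routine is pluggable, and the demonstration uses a placeholder least-squares fit (`fit_sum_pair`) instead of the LP problem (6)-(8); the mean squared error on the test sample plays the role of the regularity criterion; and the back-tracing of step 11 is omitted. All names and data are hypothetical.

```python
from itertools import combinations


def fit_sum_pair(u, v, y):
    """Placeholder fit: least-squares line y ~ b0 + b1*(u + v).
    Stands in for solving the LP (6)-(8); it is NOT the fuzzy fit itself."""
    s = [a + b for a, b in zip(u, v)]
    mean_s, mean_y = sum(s) / len(s), sum(y) / len(y)
    var = sum((a - mean_s) ** 2 for a in s)
    cov = sum((a - mean_s) * (b - mean_y) for a, b in zip(s, y))
    b1 = cov / var if var > 0 else 0.0
    b0 = mean_y - b1 * mean_s
    return lambda a, b: b0 + b1 * (a + b)


def select_rows(train, test, y_train, y_test, fit, F=3, eps=1e-6, max_rows=10):
    """Return the best test-sample criterion of each row (needs >= 2 inputs)."""
    history = []
    for _ in range(max_rows):
        scored = []
        # All pairs of current inputs: C(N, 2) on row 1, C(F, 2) afterwards
        for i, j in combinations(range(len(train)), 2):
            model = fit(train[i], train[j], y_train)
            pred = [model(a, b) for a, b in zip(test[i], test[j])]
            crit = sum((p - t) ** 2 for p, t in zip(pred, y_test)) / len(y_test)
            scored.append((crit, i, j, model))
        scored.sort(key=lambda item: item[0])
        best = scored[0][0]
        # Stop when the criterion no longer improves (cf. step 10)
        if history and (best >= history[-1] or abs(best - history[-1]) <= eps):
            history.append(best)
            break
        history.append(best)
        top = scored[:F]
        if len(top) < 2:
            break
        # Outputs of the F best models become the inputs of the next row
        new_train = [[m(a, b) for a, b in zip(train[i], train[j])] for _, i, j, m in top]
        new_test = [[m(a, b) for a, b in zip(test[i], test[j])] for _, i, j, m in top]
        train, test = new_train, new_test
    return history


# Hypothetical data: three inputs, output equal to the sum of the first two
x_train = [[1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5], [1, 0, 1, 0, 1, 0]]
y_train = [3, 3, 7, 7, 11, 11]
x_test = [[7, 8], [8, 7], [1, 0]]
y_test = [15, 15]
history = select_rows(x_train, x_test, y_train, y_test, fit_sum_pair)
print(history)
```

Replacing `fit_sum_pair` with a routine that solves (6)-(8) and returns the interval center (and width) would turn the skeleton into a fuzzy variant; the loop structure itself does not depend on how a partial model is fitted.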

4. Analysis of different membership functions

In the first papers devoted to fuzzy GMDH [3], triangular membership functions (MFs) were considered. But as fuzzy numbers may also have other kinds of MF, it’s important to consider other classes of MF in modeling problems using FGMDH. In paper [4] fuzzy models with Gaussian and bell-shaped MFs were investigated.

Consider a fuzzy set with Gaussian MF:

$$\mu_B(x) = e^{-\frac{1}{2} \frac{(x - a)^2}{c^2}} \tag{9}$$

Let the linear interval model for partial description of FGMDH take the form (4). Then the problem is formulated as follows:

Find fuzzy numbers $B_i$ with parameters $(a_i, c_i)$ such that:

  • The observation $y_k$ belongs to the estimated interval for $Y_k$ with degree not less than $\alpha$, $0 < \alpha < 1$.

  • The width of the estimated interval of degree $\alpha$ is minimal.
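The link between the degree $\alpha$ and the interval width can be checked numerically. The sketch below assumes that the Gaussian MF (9) has center $a$ and width parameter $c$; under that assumption the set of points with membership at least $\alpha$ is the interval $a \pm c\sqrt{-2 \ln \alpha}$, which is where the factor $\sqrt{-2 \ln \alpha}$ in the LP constraints comes from. The function names and numbers are illustrative only.

```python
import math


def gaussian_mf(x, a, c):
    """Gaussian membership function with center a and width parameter c."""
    return math.exp(-0.5 * (x - a) ** 2 / c ** 2)


def alpha_level_interval(a, c, alpha):
    """Interval of points whose membership degree is at least alpha (0 < alpha < 1)."""
    half_width = c * math.sqrt(-2.0 * math.log(alpha))
    return a - half_width, a + half_width


lower, upper = alpha_level_interval(a=10.0, c=2.0, alpha=0.8)
# The membership degree at both ends of the interval equals alpha
print(lower, upper, gaussian_mf(lower, 10.0, 2.0), gaussian_mf(upper, 10.0, 2.0))
```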

In [4, 6] it was shown that the problem of finding optimal fuzzy model will be finally transformed to the following LP problem:

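$$\min \left( C_0 M + C_1 \sum_{k=1}^{M} |x_{ki}| + C_2 \sum_{k=1}^{M} |x_{kj}| + C_3 \sum_{k=1}^{M} |x_{ki} x_{kj}| + C_4 \sum_{k=1}^{M} x_{ki}^2 + C_5 \sum_{k=1}^{M} x_{kj}^2 \right) \tag{10}$$

under constraints

$$\begin{aligned}
a_0 + a_1 x_{ki} + \dots + a_5 x_{kj}^2 + \left( C_0 + C_1 |x_{ki}| + \dots + C_5 x_{kj}^2 \right) \sqrt{-2 \ln \alpha} &\ge y_k, \\
a_0 + a_1 x_{ki} + \dots + a_5 x_{kj}^2 - \left( C_0 + C_1 |x_{ki}| + \dots + C_5 x_{kj}^2 \right) \sqrt{-2 \ln \alpha} &\le y_k, \qquad k = 1, \dots, M
\end{aligned} \tag{11}$$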

To solve this problem like the case with triangular MF, it’s reasonable to pass to the dual LP problem of the form

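$$\max \left( \sum_{k=1}^{M} y_k \delta_{k+M} - \sum_{k=1}^{M} y_k \delta_k \right) \tag{12}$$

with equality and inequality constraints

$$\begin{aligned}
\sum_{k=1}^{M} \delta_{k+M} - \sum_{k=1}^{M} \delta_k &= 0, \\
\sum_{k=1}^{M} x_{ki} \delta_{k+M} - \sum_{k=1}^{M} x_{ki} \delta_k &= 0, \\
&\;\;\vdots \\
\sum_{k=1}^{M} x_{kj}^2 \delta_{k+M} - \sum_{k=1}^{M} x_{kj}^2 \delta_k &= 0
\end{aligned} \tag{13}$$
$$\begin{aligned}
\sum_{k=1}^{M} \delta_k + \sum_{k=1}^{M} \delta_{k+M} &\le \frac{M}{\sqrt{-2 \ln \alpha}}, \\
\sum_{k=1}^{M} |x_{ki}| \delta_{k+M} + \sum_{k=1}^{M} |x_{ki}| \delta_k &\le \frac{\sum_{k=1}^{M} |x_{ki}|}{\sqrt{-2 \ln \alpha}}, \\
&\;\;\vdots \\
\sum_{k=1}^{M} x_{kj}^2 \delta_{k+M} + \sum_{k=1}^{M} x_{kj}^2 \delta_k &\le \frac{\sum_{k=1}^{M} x_{kj}^2}{\sqrt{-2 \ln \alpha}}
\end{aligned} \tag{14}$$
$$\delta_k \ge 0, \quad k = 1, \dots, 2M \tag{15}$$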

Analyzing the dual LP problem (12)-(15), it’s easy to notice that this problem is always solvable, as the trivial feasible solution $\delta_k = 0$, $k = 1, \dots, 2M$, always exists. Therefore the initial problem (10) and (11) also always has a solution for any data.

Thus, fuzzy GMDH makes it possible to construct fuzzy models and has the following advantages:

  1. The problem of optimal model determination is transformed into a linear programming problem, which is always solvable.

  2. The method yields an interval regression model.

5. Fuzzy GMDH with different partial descriptions: orthogonal polynomials

As is well known from the general GMDH theory, candidate models are generated on the basis of so-called partial descriptions, which are elementary models of two variables. Usually linear or quadratic polynomials are used as partial descriptions. An alternative to this class of models is the application of orthogonal polynomials. The choice of orthogonal polynomials as partial descriptions is motivated by the following advantages:

  • Owing to orthogonality, the polynomial coefficients are determined faster than for non-orthogonal polynomials.

  • The coefficients of the approximating polynomial don’t depend on the degree of the initial polynomial model. Therefore, if the true polynomial degree is not known a priori, one may calculate polynomials of various degrees, and the coefficients obtained for polynomials of lower degrees remain the same after passing to higher degrees. This property is the most important one when investigating the true degree of the approximating polynomial.

5.1 Chebyshev’s orthogonal polynomials

Chebyshev’s orthogonal polynomials in general case have the following form [5]:

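$$F_\nu(\xi) = T_\nu(\xi) = \cos(\nu \arccos \xi), \quad -1 \le \xi \le 1 \tag{16}$$

These polynomials have the following orthogonality property:

$$\int_{-1}^{1} \frac{T_\mu(\xi) T_\nu(\xi)}{\sqrt{1 - \xi^2}} \, d\xi = \begin{cases} 0, & \mu \ne \nu; \\ \pi/2, & \mu = \nu \ne 0; \\ \pi, & \mu = \nu = 0, \end{cases} \tag{17}$$

where $1/\sqrt{1 - \xi^2}$ is the weighting function $\omega(\xi)$ in Eq. (17).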
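The definition (16) and the orthogonality property (17) are easy to verify numerically. The sketch below evaluates $T_\nu$ both by the standard three-term recurrence $T_{n+1}(\xi) = 2\xi T_n(\xi) - T_{n-1}(\xi)$ and by the trigonometric form, and approximates the weighted integral with Gauss-Chebyshev quadrature. The function names and the number of quadrature nodes are choices made here for illustration.

```python
import math


def chebyshev_T(nu, xi):
    """T_nu(xi) via the recurrence T_{n+1} = 2*xi*T_n - T_{n-1}."""
    t_prev, t_curr = 1.0, xi
    if nu == 0:
        return t_prev
    for _ in range(nu - 1):
        t_prev, t_curr = t_curr, 2.0 * xi * t_curr - t_prev
    return t_curr


def chebyshev_T_trig(nu, xi):
    """T_nu(xi) via the trigonometric form (16), valid for -1 <= xi <= 1."""
    return math.cos(nu * math.acos(xi))


def weighted_inner(mu, nu, n_nodes=64):
    """Gauss-Chebyshev approximation of the integral in (17)."""
    total = 0.0
    for i in range(1, n_nodes + 1):
        node = math.cos((2 * i - 1) * math.pi / (2 * n_nodes))
        total += chebyshev_T(mu, node) * chebyshev_T(nu, node)
    return math.pi / n_nodes * total


print(chebyshev_T(3, 0.5), chebyshev_T_trig(3, 0.5))
print(weighted_inner(2, 3), weighted_inner(2, 2), weighted_inner(0, 0))
```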

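The approximating Chebyshev orthogonal polynomial for $\bar{y}$ is obtained by minimizing the function $S$:

$$S = \int_{-1}^{1} \omega(\xi) \left( y(\xi) - \sum_{i=0}^{m} b_i T_i(\xi) \right)^2 d\xi \tag{18}$$

From (18) we obtain the following expressions for the coefficients:

$$b_k = \begin{cases} \dfrac{1}{\pi} \displaystyle\int_{-1}^{1} \frac{y(\xi)}{\sqrt{1 - \xi^2}} \, d\xi, & k = 0; \\[2ex] \dfrac{2}{\pi} \displaystyle\int_{-1}^{1} \frac{y(\xi) T_k(\xi)}{\sqrt{1 - \xi^2}} \, d\xi, & k \ne 0 \end{cases} \tag{19}$$

Hence, the approximating equation takes the form

$$\bar{y}(\xi) = \sum_{k=0}^{m} b_k T_k(\xi) \tag{20}$$

As may be readily seen from the presented expressions, the coefficient $b_k$ in Eq. (19) doesn’t depend on the choice of the degree $m$. Thus, changing $m$ doesn’t require recalculation of $b_j$, $j \le m$, while such recalculation is necessary for non-orthogonal approximation.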
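The independence of the coefficients from the chosen degree can be illustrated with a short Python sketch. It approximates the integrals in (19) by Gauss-Chebyshev quadrature, which is an implementation choice made here rather than something prescribed by the text; the function names and the test function are likewise hypothetical.

```python
import math


def chebyshev_T(k, xi):
    """T_k(xi) in the trigonometric form (16)."""
    return math.cos(k * math.acos(xi))


def chebyshev_coeffs(y, m, n_nodes=64):
    """Coefficients b_0..b_m of (19), with the integrals replaced by quadrature."""
    nodes = [math.cos((2 * i - 1) * math.pi / (2 * n_nodes))
             for i in range(1, n_nodes + 1)]
    coeffs = []
    for k in range(m + 1):
        s = sum(y(x) * chebyshev_T(k, x) for x in nodes)
        # (1/pi)*(pi/n) for k = 0 and (2/pi)*(pi/n) for k != 0
        coeffs.append(s / n_nodes if k == 0 else 2.0 * s / n_nodes)
    return coeffs


# Raising the degree only appends coefficients; the earlier ones are unchanged
b_low = chebyshev_coeffs(math.exp, m=2)
b_high = chebyshev_coeffs(math.exp, m=5)
print(b_low)
print(b_high[:3])

# A function that is itself a Chebyshev combination: 0.5*T_1 + T_2
b_poly = chebyshev_coeffs(lambda x: 0.5 * x + (2.0 * x * x - 1.0), m=3)
print(b_poly)
```

Each $b_k$ is computed from $y$ and $T_k$ alone, so a model of degree $m + 1$ reuses all coefficients of the degree-$m$ model.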

The best degree $m$ of the approximating polynomial may be obtained on the basis of the hypothesis that the observation results $y_i$, $i = 1, 2, \dots, r$, have independent Gaussian distributions around some polynomial function $\bar{y}$ of a definite degree, for example, $m + \mu$, where

$$\bar{y}_{m+\mu}(x_i) = \sum_{j=0}^{m+\mu} b_j x_i^j \tag{21}$$

and the variance $\sigma^2$ of the distribution of $y - \bar{y}$ doesn’t depend on $\mu$.

It’s clear that for very small $m$ ($m = 0, 1, 2, \dots$), $\sigma_m^2$ decreases as $m$ grows.

Since, in accordance with the hypothesis formulated above, the variance doesn’t depend on $\mu$, the best degree is the minimal $m$ for which $\sigma_m \approx \sigma_{m+1}$.

To determine this degree, it’s necessary to calculate the approximating polynomials of various degrees. As the coefficients $b_j$ in Eq. (20) don’t depend on the degree, the determination of the best polynomial degree is accelerated.
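A minimal sketch of this degree selection rule is given below. It reads the rule as "take the smallest $m$ for which $\sigma_m$ and $\sigma_{m+1}$ are approximately equal"; the relative tolerance used to decide "approximately equal", the function name, and the sample values are assumptions introduced here.

```python
def select_degree(sigmas, rel_tol=0.05):
    """sigmas[m] is the residual standard deviation of the degree-m fit.
    Return the smallest m with sigma_m approximately equal to sigma_{m+1}."""
    for m in range(len(sigmas) - 1):
        if abs(sigmas[m] - sigmas[m + 1]) <= rel_tol * sigmas[m]:
            return m
    # No plateau found: fall back to the highest degree that was tried
    return len(sigmas) - 1


# Hypothetical residual standard deviations for degrees 0..4
sigmas = [4.0, 1.5, 0.52, 0.51, 0.50]
best_m = select_degree(sigmas)
print(best_m)
```

Here the residual spread stops improving noticeably after degree 2, so that degree is selected.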

Let $Y$ be the forecasted variable and $x_1, x_2, \dots, x_n$ the input variables. Let’s search for the relation between them in the following form:

$$Y = A_1 f_1(x_1) + A_2 f_2(x_2) + \dots + A_n f_n(x_n) \tag{22}$$

where $A_i$ is a fuzzy number of triangular type given as $A_i = (\alpha_i, c_i)$,

and the functions $f_i$ are defined as follows [5, 6]:

$$f_i(x_i) = \sum_{j=0}^{m_i} b_{ij} T_j(x_i) \tag{23}$$

The degree $m_i$ of the function $f_i$ is determined using the hypothesis described above. So if we denote $z_i = f_i(x_i)$, we obtain the linear interval model in its classical form.

5.2 Investigation of trigonometric polynomials as partial descriptions

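Let the function $f(x)$ be periodic with period $2\pi$ and defined on the interval $[-\pi, \pi]$, and let its derivative $f'(x)$ also be defined on $[-\pi, \pi]$. Then the following equality holds:

$$S(x) = f(x), \quad x \in [-\pi, \pi] \tag{24}$$

where

$$S(x) = \frac{a_0}{2} + \sum_{j=1}^{\infty} \left( a_j \cos jx + b_j \sin jx \right) \tag{25}$$

The coefficients $a_j$ and $b_j$ are calculated by the Euler formulas:

$$a_j = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos jx \, dx; \quad b_j = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin jx \, dx \tag{26}$$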

5.3 Definition

A trigonometric polynomial of degree $M$ is a polynomial of the form

$$T_M(x) = \frac{a_0}{2} + \sum_{j=1}^{M} \left( a_j \cos jx + b_j \sin jx \right) \tag{27}$$

The following theorem holds: there exists an $M$, where $2M < N$, that minimizes the criterion

$$\sum_{i=1}^{N} \left( f(x_i) - T_M(x_i) \right)^2 \tag{28}$$

The coefficients of the corresponding trigonometric polynomial are determined by the formulas

$$a_j = \frac{2}{N} \sum_{i=1}^{N} f(x_i) \cos j x_i; \quad b_j = \frac{2}{N} \sum_{i=1}^{N} f(x_i) \sin j x_i \tag{29}$$
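The discrete formulas (29) can be tried on a function whose trigonometric expansion is known. The sketch below assumes that the $N$ sample points are equally spaced over one period, $x_i = -\pi + 2\pi i / N$, which is the setting in which these sums recover the coefficients exactly for $2M < N$; this spacing, the function names, and the test function are assumptions made for the illustration.

```python
import math


def trig_coeffs(f_values, xs, M):
    """Coefficients a_0..a_M and b_1..b_M computed by the sums (29)."""
    N = len(xs)
    a = [2.0 / N * sum(f * math.cos(j * x) for f, x in zip(f_values, xs))
         for j in range(M + 1)]
    b = [0.0] + [2.0 / N * sum(f * math.sin(j * x) for f, x in zip(f_values, xs))
                 for j in range(1, M + 1)]
    return a, b


def trig_poly(a, b, x):
    """Evaluate the trigonometric polynomial (27)."""
    return a[0] / 2.0 + sum(a[j] * math.cos(j * x) + b[j] * math.sin(j * x)
                            for j in range(1, len(a)))


N = 16
xs = [-math.pi + 2.0 * math.pi * i / N for i in range(N)]


def f(x):
    # Known expansion: a_0/2 = 1, a_1 = 2, b_2 = 3
    return 1.0 + 2.0 * math.cos(x) + 3.0 * math.sin(2.0 * x)


a, b = trig_coeffs([f(x) for x in xs], xs, M=3)
print([round(v, 6) for v in a], [round(v, 6) for v in b])
print(trig_poly(a, b, 0.3), f(0.3))
```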

Let $Y$ be the forecasted variable and $x_1, x_2, \dots, x_n$ the input variables. Let’s search for the dependence among them in the form

$$Y = A_1 f_1(x_1) + A_2 f_2(x_2) + \dots + A_n f_n(x_n) \tag{30}$$

where $A_i$ is a fuzzy number of triangular type given as $A_i = (\alpha_i, c_i)$, and the functions $f_i$ are defined as follows:

$$f_i(x_i) = T_{M_i}(x_i) \tag{31}$$

The degree $M_i$ of the function $f_i$ is determined by the theorem stated above. Therefore, if we assign $z_i = f_i(x_i)$, the linear interval model is obtained in its classical form.

6. Adaptation of fuzzy GMDH models

When forecasting by self-organizing methods (fuzzy GMDH, in particular), the problem of adaptation arises when the training sample grows and the obtained model has to be corrected in accordance with newly available data. Adaptation that takes into account new information obtained while forecasting may be done by two approaches. The first one is to correct the parameters of the forecasting model with new data, assuming that the model structure doesn’t change. The second approach consists in adapting not only the model parameters but its optimal structure as well. This approach requires the repeated use of the full GMDH algorithm and involves a large volume of calculations.

The second approach is used if the adaptation of parameters doesn’t provide a good forecast and the new real output values don’t fall within the calculated interval estimate.

In our work the first approach is used, based on the adaptation of FGMDH model parameters with newly available data. Here recursive identification methods are preferable, especially the recursive LSM. In this method the parameter estimates at the next step are determined on the basis of the estimates at the previous step, the model error, and an information matrix, which is modified throughout the estimation process and therefore contains data that may be used at the next steps of the adaptation process [5].

Hence, the adaptation of model coefficients is simplified substantially. If we store the information matrix obtained while identifying the optimal model by fuzzy GMDH, then to adapt the model parameters it’s enough to perform only one iteration of the recursive LSM.

6.1 The application of recursive LSM for model coefficients adaptation

Consider the following model:

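$$y(k) = \theta^T \Psi(k) + v(k) \tag{32}$$

where $y(k)$ is a dependent (output) variable, $\Psi(k)$ is a measurement vector, $v(k)$ are random disturbances, and $\theta$ is a parameter vector to be estimated.

The estimate of the parameters $\theta$ at step $N$ is computed by the formula [5, 6]:

$$\theta(N) = \theta(N-1) + \gamma(N) \left[ y(N) - \theta^T(N-1) \Psi(N) \right] \tag{33}$$

where $\gamma(N)$ is a coefficient vector determined by the formula

$$\gamma(N) = \frac{P(N-1) \Psi(N)}{1 + \Psi^T(N) P(N-1) \Psi(N)} \tag{34}$$

where $P(N-1)$ is the so-called information matrix, determined by the formula

$$P(N-1) = P(N-2) - \frac{P(N-2) \Psi(N-1) \Psi^T(N-1) P(N-2)}{1 + \Psi^T(N-1) P(N-2) \Psi(N-1)} \tag{35}$$

As one can easily see in (35), the information matrix may be obtained independently of the parameter estimation process and in parallel with it. The adaptation of the two parameter vectors $\theta_1^T = [\alpha_1, \dots, \alpha_m]$ and $\theta_2^T = [C_1, \dots, C_m]$ is performed in this way using formulas (33)-(35):

$$\begin{aligned}
\theta_1(N) &= \theta_1(N-1) + \gamma_1(N) \left[ y(N) - \theta_1^T(N-1) \Psi_1(N) \right], \\
\theta_2(N) &= \theta_2(N-1) + \gamma_2(N) \left[ y_c(N) - \theta_2^T(N-1) \Psi_2(N) \right], \\
y_c(N) &= \left| y(N) - \theta_1^T(N-1) \Psi_1(N) \right|
\end{aligned} \tag{36}$$

where $\Psi_1^T = [z_1, \dots, z_m]$ and $\Psi_2^T = [|z_1|, \dots, |z_m|]$.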
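The update formulas (33)-(36) translate into a few lines of Python. The sketch below is written with plain lists to stay dependency-free; it assumes a symmetric information matrix, takes $y_c$ as the absolute deviation of the new observation from the previous center forecast and $\Psi_2$ as the absolute values of the regressors, and does not enforce $C_i \ge 0$ after the update. The function names, initial matrices, and data are hypothetical.

```python
def rls_update(theta, P, psi, y):
    """One recursive least squares step, following (33)-(35).
    P is assumed symmetric, so Psi^T P equals (P Psi)^T."""
    n = len(psi)
    P_psi = [sum(P[i][j] * psi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(p * q for p, q in zip(psi, P_psi))
    gamma = [v / denom for v in P_psi]                       # (34)
    err = y - sum(t * q for t, q in zip(theta, psi))
    theta_new = [t + g * err for t, g in zip(theta, gamma)]  # (33)
    P_new = [[P[i][j] - P_psi[i] * P_psi[j] / denom for j in range(n)]
             for i in range(n)]                              # (35), for the next step
    return theta_new, P_new


def adapt_fuzzy_model(alpha, c, P1, P2, z, y):
    """Adapt the centers and widths of the fuzzy coefficients as in (36)."""
    abs_z = [abs(v) for v in z]
    # Deviation from the forecast made with the previous centers
    y_c = abs(y - sum(a * v for a, v in zip(alpha, z)))
    alpha_new, P1_new = rls_update(alpha, P1, z, y)
    c_new, P2_new = rls_update(c, P2, abs_z, y_c)
    return alpha_new, c_new, P1_new, P2_new


# Check on noise-free data generated by y = 2*z1 + 3*z2
theta, P = [0.0, 0.0], [[1e6, 0.0], [0.0, 1e6]]
for psi, y in [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0), ([1.0, 1.0], 5.0)]:
    theta, P = rls_update(theta, P, psi, y)
print(theta)

# One adaptation step of a hypothetical two-coefficient fuzzy model
alpha, c = [1.0, 0.5], [0.2, 0.1]
P1 = [[100.0, 0.0], [0.0, 100.0]]
P2 = [[100.0, 0.0], [0.0, 100.0]]
z, y_new = [1.0, -2.0], 0.5
alpha_new, c_new, P1, P2 = adapt_fuzzy_model(alpha, c, P1, P2, z, y_new)
print(alpha_new, c_new)
```

Starting from a large diagonal information matrix is a common way to express weak prior knowledge; when the matrix stored from the identification stage is available, it would be used instead.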

7. Experimental investigations of FGMDH in forecasting

The goal of the experiments was to forecast macroeconomic indicators of Ukraine and to estimate the efficiency of the suggested FGMDH. The experiments used a database containing monthly values of 24 macroeconomic indicators of the Ukrainian economy from July 1995 to 2013. The consumer price index (CPI) and gross national product (GNP) were chosen as the forecasted variables.

In constructing the forecasting models, the sliding window technique was used, with the window size determined automatically by regression analysis. Regression analysis was also used to determine the input variables significant for forecasting.
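The sliding window technique can be illustrated by the following sketch, which turns a series into (inputs, target) training pairs. The window size is fixed by hand here, whereas in the experiments it was determined by regression analysis; the function name and the sample values are hypothetical.

```python
def sliding_windows(series, window):
    """Pairs (inputs, target): `window` consecutive values predict the next one."""
    return [(series[t - window:t], series[t])
            for t in range(window, len(series))]


# Hypothetical monthly index values, for illustration only
cpi = [100.2, 100.9, 101.5, 101.1, 102.3, 103.0]
samples = sliding_windows(cpi, window=3)
for inputs, target in samples:
    print(inputs, "->", target)
```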

The following experiments were performed:

  1. Construction of forecasting models with different membership functions: triangular, Gaussian, and bell-shaped.

  2. Construction of forecasting models for the macroeconomic indicators (CPI and GNP) using different partial descriptions: classic Chebyshev polynomials and trigonometric polynomials.

  3. Adaptation of the models by the stochastic approximation algorithm and the recursive least squares method (RLSM).

  4. Comparative analysis of the suggested algorithms with classic GMDH and neural networks (NN), in particular a backpropagation neural network.

7.1 Comparison of different membership functions

The experimental investigations of fuzzy forecasting models were carried out with the following MFs: triangular, Gaussian, and bell-shaped. RMSE was chosen as the accuracy criterion. The RMSE values for CPI forecasting are presented in Figure 1.

Figure 1.

Forecasting accuracy of CPI with different MF.

As one can see, bell-shaped membership functions for the fuzzy coefficients are the most efficient for constructing linear interval models, Gaussian MFs are in second place, and the worst forecasting accuracy was obtained with triangular MFs. In the next experiment, the task was to forecast GNP values.

In Figure 2 the obtained RMSE values for forecasting GNP are presented.

Figure 2.

Forecasting accuracy of GNP with different MF.

As one can see, the results are practically the same as in the previous experiment. The best accuracy was attained with the bell-shaped MF.
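For reference, the accuracy measures used in these comparisons, the mean squared error (MSE) and its square root (RMSE), can be computed as in the sketch below. The function names and the numbers are illustrative and are not taken from the experiments.

```python
import math


def mse(actual, forecast):
    """Mean squared error between actual values and point forecasts."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(mse(actual, forecast))


# Hypothetical actual values and forecasts
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 6.0]
print(mse(actual, forecast), rmse(actual, forecast))
```

For an interval model, the point forecast compared with the actual value would typically be the interval center.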

7.2 Comparison of different partial descriptions

In the next series of experiments, FGMDH models with the following partial descriptions were investigated: quadratic polynomials, Chebyshev polynomials, trigonometric polynomials, and ARIMA models. Figure 3 presents the accuracy of CPI forecasting with different PDs.

Figure 3.

Forecasting accuracy of CPI for different PD.

As we can see, the best results are obtained with models that use trigonometric polynomials as PDs. The results with classic quadratic polynomials are somewhat worse, and ARIMA models as PDs turned out to be the worst. This may be explained by the fact that ARIMA models are functions of one variable, which is a serious drawback of such models.

7.3 Comparison of crisp and fuzzy GMDH

For a more comprehensive efficiency comparison of crisp and fuzzy GMDH, the existing implementation of GMDH was extended with new types of PD: orthogonal Chebyshev polynomials, trigonometric polynomials, and ARIMA models. Stochastic approximation and the recursive LSM were implemented as adaptation algorithms. Figures 4 and 5 present the mean RMSE values for crisp and fuzzy GMDH over the whole range of data variation for different types of PD, without adaptation and with the adaptation algorithms.

Figure 4.

Forecasting accuracy of classical and fuzzy GMDH for CPI.

Figure 5.

Forecasting accuracy of classical and fuzzy GMDH for GNP.

As one can easily see from the presented results, fuzzy GMDH shows better forecasting accuracy than classic GMDH for all adaptation algorithms.

So the results of the experiments have confirmed the advantages of fuzzy GMDH over classic GMDH in forecasting macroeconomic indicators. In the next experiments, fuzzy GMDH was compared with a backpropagation neural network (NN). The final results, MSE values over five forecasting points for CPI and GNP, are presented in Table 1.

| Method | Without adaptation, CPI | Without adaptation, GNP | Stochastic approximation, CPI | Stochastic approximation, GNP | RLSM, CPI | RLSM, GNP |
|---|---|---|---|---|---|---|
| Triangular MF + quadratic polynomial | 0.308 | 530.3 | 0.184 | 330.0 | 0.173 | 311.9 |
| Gaussian MF + quadratic polynomial | 0.294 | 531.3 | | | | |
| Bell-shaped MF + quadratic polynomial | 0.268 | 497.9 | | | | |
| Triangular MF + Chebyshev’s polynomial | 0.403 | 621.4 | 0.341 | 458.1 | 0.337 | 377.2 |
| Triangular MF + Laguerre polynomial | 0.372 | 589.5 | 0.264 | 442.9 | 0.293 | 378.5 |
| Triangular MF + trigonometric polynomial | 0.261 | 537.7 | 0.185 | 347.9 | 0.165 | 331.9 |
| Triangular MF + ARIMA model | 0.862 | 704.3 | 0.683 | 513.5 | 0.597 | 472.6 |
| GMDH + quadratic polynomial | 0.343 | 596.7 | 0.204 | 428.2 | 0.192 | 369.2 |
| GMDH + Chebyshev’s polynomial | 0.425 | 641.4 | 0.351 | 473.2 | 0.347 | 398.4 |
| GMDH + Laguerre polynomial | 0.396 | 598.5 | 0.292 | 459.0 | 0.274 | 376.4 |
| GMDH + trigonometric polynomial | 0.291 | 574.8 | 0.182 | 349.5 | 0.177 | 332.2 |
| GMDH + ARIMA model | 0.902 | 728.4 | 0.749 | 518.7 | 0.714 | 498.3 |
| NN backpropagation (1) | 0.954 | 792.3 | | | | |
| NN backpropagation (2) | 0.741 | 668.6 | | | | |

Table 1.

Forecasting accuracy (MSE) for different forecasting methods.

(1) Neural network constructed with Neural Networks Toolbox 4.0.6 (MathWorks).

(2) Neural network constructed with Alyuda Forecaster 1.6 (Alyuda Research).

Summing up the experimental results, the following conclusions were made:

  1. The forecasting accuracy of fuzzy GMDH algorithms is, on the whole, better than that of non-fuzzy GMDH.

  2. The forecasting accuracy of non-fuzzy and fuzzy GMDH algorithms is better than that of NN backpropagation. Changing the membership functions doesn’t lead to significant changes in forecasting quality, but the best results were obtained with bell-shaped and Gaussian MFs.

  3. The best forecasting accuracy for the considered problems was obtained with fuzzy GMDH models using quadratic and trigonometric partial descriptions.

  4. The best adaptation algorithm for fuzzy GMDH models is the recursive LSM (RLSM).

In [7, 8] the generalization of fuzzy GMDH to the case when the input data are also fuzzy was considered. Then the linear interval regression model takes the following form:

$$Y = A_0 Z_0 + A_1 Z_1 + \dots + A_n Z_n.$$

Consider the case of a symmetrical membership function for the parameters $A_i$, so that they can be described by the pair of parameters $(a_i, c_i)$, where

$\underline{A}_i = a_i - c_i$, $\overline{A}_i = a_i + c_i$, $c_i$ is the interval width, $c_i \ge 0$,

and $Z_i$ is an input variable, which is also a fuzzy number of triangular shape defined by three parameters $(\underline{Z}_i, Z_i, \overline{Z}_i)$, where $\underline{Z}_i$ is the lower border, $Z_i$ is the center, and $\overline{Z}_i$ is the upper border of the fuzzy number.

It was shown that the corresponding problem is also an LP problem, and the corresponding FGMDH algorithm was developed for this case [7, 8].

8. Conclusions

In this paper the fuzzy inductive modeling method FGMDH is considered.

The algorithms of FGMDH with different membership functions and different partial descriptions, including orthogonal polynomials, were presented and analyzed.

Experimental investigations of GMDH and fuzzy GMDH in forecasting macroeconomic indices of the Ukrainian economy were carried out.

Comparative investigations of FGMDH with ARIMA and a backpropagation neural network were performed.

The analysis of the experimental results has confirmed the high accuracy of fuzzy GMDH in macroeconomic forecasting problems.

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Yu. Zaychenko and Helen Zaychenko (May 27th 2019). Investigation of Fuzzy Inductive Modeling Method in Forecasting Problems [Online First], IntechOpen, DOI: 10.5772/intechopen.86348. Available from:
