Analysis of Financial Time Series in Frequency Domain Using Neural Networks

Developing new methods for forecasting time series and applying existing techniques in different areas are a permanent concern for both researchers and companies interested in gaining competitive advantages. Financial market analysis is important for investors who put money on the market and want some security in multiplying their investment. Among the existing techniques, artificial neural networks have proven to be very good at predicting financial market performance. In this chapter, a nonlinear autoregressive exogenous (NARX) neural network is used for time series analysis and forecasting of specific values. Both data in the time domain and data in the frequency domain, obtained using the Fourier transform, are used as input to the network. After the experiment was performed, the results were compared to determine which time series is potentially best suited for prediction, as well as the domain in which better results are obtained.


Introduction
The future has five faces: innovation, digitalization, urbanization, community, and humanity. The scientific sector should develop each of them, but one that occupies a leadership position is definitely digitalization. It strives for the future every day and is struggling to overcome professional challenges, but in fact it is already the present. Modern technologies surround all of us, and they are our most reliable partners for the future. Through good-quality work and determination, clients will share with you their business needs and requirements, certain that you will find the right solutions for them.
Nowadays, many companies and organizations collect data on a large scale in order to discover knowledge that helps managers gain a competitive advantage. Timely and accurate analysis of such data is a difficult task, and it is not always possible using conventional methods. Considering the effect that could be obtained, new horizons are opening, and challenges are created for researchers to extract useful information [1].
The concept that is very important and where more companies are investing in development is data science in order to find new ways to discover the real needs,

Methods and techniques of problem solving
The development of neural networks is currently driven by two factors. The first is the increasing availability of modern computers and easy-to-use software tools, which enables rapid development of neural networks by individuals and groups that have only basic knowledge of the area. The second is the notable success of neural networks in areas where traditional computer systems have many problems and disadvantages. Nevertheless, there are many other methods that deal with the same or similar problems, so some of them are listed here.
A method that is increasingly used in predicting financial time series is the support vector machine (SVM). Many scientific papers compare this method with neural networks in terms of which is more precise, which corresponds better to the set goals, and what advantages each has over the other [2,3].
Another commonly used approach to this type of problem is the random walk model. It is a financial theory that describes changes in the stock market as random and unpredictable. The changes are assumed to follow a statistical distribution, and an appropriate model is developed. Statistical hypothesis testing is then performed to conclude whether price changes depend on one another or are completely independent.
In finance, the main problem is the unstable nature of the observed time series and their heteroscedasticity, which makes it impossible to apply certain time series models. The study in [4] empirically investigates the forecasting performance of the generalized autoregressive conditional heteroscedastic (GARCH) model for NASDAQ-100 returns over a period of 6 years, which prove to be a financial time series characterized by heteroscedasticity. Volatility forecasting performance is found to be significantly improved. Generally, ARCH and GARCH models, along with their extensions, provide a statistical stage on which many theories of asset pricing, portfolio analysis, value at risk, or index volatility can be exhibited or tested. Volatility has been the subject of much research in financial markets, especially as an essential input to many financial decision-making models. Investment decisions strongly depend on the forecast of expected returns and volatilities of the assets. The introduction of the ARCH model created a new approach for financial econometricians and became a popular tool for volatility modeling and forecasting [4].
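To make the idea of conditional heteroscedasticity concrete, the following minimal sketch computes the conditional variance of a GARCH(1,1) model with the standard recursion sigma2[t] = omega + alpha * e[t-1]^2 + beta * sigma2[t-1]. The function name, the example returns, the parameter values, and the use of the sample variance as the starting value are illustrative assumptions and are not taken from [4].

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) model for given residuals.

    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]
    The first value is initialised with the sample variance (an assumption).
    """
    n = len(residuals)
    mean = sum(residuals) / n
    sigma2 = [sum((r - mean) ** 2 for r in residuals) / n]
    for t in range(1, n):
        sigma2.append(omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


# Hypothetical daily returns and parameters, for illustration only.
example_returns = [0.01, -0.02, 0.015, -0.03, 0.005]
example_sigma2 = garch11_variance(example_returns, omega=1e-5, alpha=0.1, beta=0.85)
```

A large squared return at time t-1 raises the variance at time t, which is how the model reproduces volatility clustering. In practice the parameters would be estimated from data rather than fixed by hand.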
Other well-known econometric models for time series are the generalized autoregressive conditional heteroscedastic (GARCH) and exponential generalized autoregressive conditional heteroscedastic (EGARCH) models. In comparative analyses in other papers they proved less effective than NARX, so in this chapter they are not considered or compared with the network [5].
Traditionally, the Box-Jenkins or autoregressive integrated moving-average (ARIMA) model has dominated time series forecasting; it includes the identification, estimation, and checking of the suitability of the selected time series model. Although it is rather flexible, can be used for a large number of time series, and can model nonstationary time series, its main limitation is the assumption of linearity. The model cannot explain nonlinear behavior, which is at the core of financial time series. Conventional statistical approaches and neural networks are complementary for this use. The neural network is not transparent and has a stochastic component, so it should be trained several times, after which the average value is taken to see how stable the obtained solution is. Also, statistical predictive techniques reach their limits when it comes to nonlinearity in data, while neural networks are increasingly applied, beyond prediction, in classification and pattern recognition [6,7].

NARX neural networks
Neural networks are computer simulations programmed to learn on the basis of available data. They are used to solve a wide range of problems related to clustering, classification, pattern recognition, optimization, function approximation, and prediction. They are characterized by their layers (the input layer, the hidden layer, and the output layer) and the connections between them. The number of these connections, along with the weight coefficients, represents the real power of the neural network. Input neurons accept information, while output neurons generate signals for specific actions [8].
The types of networks are grouped into five main classes:
• Single-layer feedforward networks
Depending on the algorithm, it is determined what kind of propagation the network will use in relation to the type of network. The most important element in this paper is the hidden layer, whose number of nodes determines the complexity of the prediction model. The activation function is an indispensable part, necessary for the neural network to be able to learn nonlinear functions. Without nonlinearity, the network would be able to model only linear data dependencies.
A combination of linear functions is again a linear function, so it is advisable to choose a nonlinear function as the activation function. The network compares the obtained and expected results and, if there are differences, modifies the neural connections in order to reduce the difference between the current and the desired output. During the learning process, the existing synaptic weights are corrected in order to get a better and more reliable output. The network is trained until the samples no longer lead to a change in the coefficients. NARX neural networks are used very often as a good and highly efficient predictor of time series. The structure of the NARX neural network is shown in Figure 1.
Previously, linear parametric models such as the autoregressive (AR), moving-average (MA), or autoregressive integrated moving-average models were used for predicting time series. They were not able to solve problems related to nonstationary signals and signals whose mathematical model is not linear. On the other hand, a neural network is a powerful tool when applied to problems whose solutions require knowledge that is difficult to specify and express, but for which there is sufficient representation in examples and practice.
The nonlinear autoregressive exogenous (NARX) neural network is a dynamic neural architecture used to model nonlinear dynamic systems. It differs from the nonlinear autoregressive (NAR) network in that it has, besides the standard input, an additional time series with external data, which increases the accuracy of the prediction. For applications related to the prediction of time series, it is designed as a feedforward time delay neural network (TDNN). The equation of the NARX model [8] is

$$y(t) = f\big(y(t-1), \ldots, y(t-n_y),\; x(t-1), \ldots, x(t-n_x)\big) \tag{1}$$

where $y(t)$ is the output of the NARX neural network with $n_y$ delays (2 lags) and $x(t)$ is the input of the NARX neural network with $n_x$ delays (2 lags).
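The way a NARX model consumes delayed values can be sketched as follows: each training input collects the last n_y values of the output series y and the last n_x values of the exogenous series x, and the target is the current value y(t). The function name and the toy data are hypothetical; only the two delays follow the text.

```python
def narx_regressors(y, x, n_y=2, n_x=2):
    """Build (inputs, targets) pairs for a NARX model.

    Each input row is [y(t-1), ..., y(t-n_y), x(t-1), ..., x(t-n_x)]
    and the matching target is y(t).
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    start = max(n_y, n_x)
    inputs, targets = [], []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, n_y + 1)]
        past_x = [x[t - k] for k in range(1, n_x + 1)]
        inputs.append(past_y + past_x)
        targets.append(y[t])
    return inputs, targets


# Toy series, for illustration only.
rows, targets = narx_regressors([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
# rows[0] == [2, 1, 20, 10] and targets[0] == 3
```

The first max(n_y, n_x) observations are used only as history, so a series of length N yields N - 2 training pairs when both delays are 2.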
In the NARX neural network model, a multilayer perceptron (MLP) is used. The task of the program is to learn how to assign the accurate output to new, unlabeled data. When the variables that need to be predicted are continuous, the problem is defined as regression. If the predicted values can only take a limited set of discrete values, the problem is defined as classification. Each time the network is trained, the results can differ because of the initial values of the weights w and the biases b.

Fourier transform
Methods based on the Fourier transform have wide application in all areas of science and engineering. The Fourier transform is used in signal processing, for solving differential equations, and in analyzing the dynamics of the market and stock exchange. Among many other tools, one frequently used along with the transform is convolution, which is often applied in the same areas. It is known that it is not possible to define the product of two random distributions, and there convolution finds its application, especially in the field of finance (securities) when deriving the necessary formulas.
The Fourier series represents a periodic function as an infinite sum of sine and cosine functions in the frequency domain (Eq. (2)):

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n \cos\left(\frac{2\pi n t}{T}\right) + b_n \sin\left(\frac{2\pi n t}{T}\right)\right] \tag{2}$$

where $T$ is the period and $a_n$, $b_n$ are the Fourier coefficients. One application is the pricing of options, where the price is uniquely determined by characteristic functions within Fourier analysis. Stochastic Lévy processes are often mentioned in the fields of insurance and finance, as is the assumption of the Black-Scholes model that the price of the underlying asset follows geometric Brownian motion. One of the disadvantages of this model is precisely the assumption of constant volatility over time. It is difficult to determine whether these are really disadvantages or simply that the market is inefficient, which is significant to investors as information about the risk protection they are trying to achieve.
However, the Fourier transform is rarely suitable for processing nonstationary signals or those whose frequency content changes over time, since the periodic signal should be centered around integer multiples of the sampling frequency. Such a signal is then divided into smaller time segments, and the frequency content of each segment is analyzed. For this reason, there is also the wavelet transform, with the possibility of dilation and translation of the wavelet as the basis function of the transform [9].

Data description and data analysis
The six major traded Forex currency pairs are EUR/USD, GBP/USD, AUD/USD, USD/CAD, USD/JPY, and USD/CHF. In this chapter, the EUR/USD pair was selected for the time series analysis, considering its share in the total trading volume (27%). Cross currency pairs, which do not include the US dollar, often have a smaller trading volume and larger spreads than the major currency pairs, so they are less suitable for analysis.
Unlike Forex, which is characterized by large oscillations, a stock index may make it easier to notice a certain trend that changes slowly over time. Based on this, it might be assumed that the S&P 500 index will show better features related to the prediction of the series.
Relevant historical currency pair data for more than 10 years have been downloaded from the website of Fusion Media Limited [10]. In the analysis of time series from the stock exchange, a representative index S&P 500 was used with the historical data downloaded from the website of Yahoo! Finance [11].
The collected data consist of four prices per day (high, low, open, close) in the period from 2003 to September 2018, of which the close price is used in the analysis. The graph of the returns of the S&P 500 stock index in the time domain, based on 3950 observations in the period 31/12/2002-07/09/2018, is shown in Figure 2.
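The text does not state whether simple or logarithmic returns were computed from the close prices. The sketch below assumes logarithmic returns, r_t = ln(P_t / P_{t-1}), which is a common choice; the function name and the example prices are hypothetical.

```python
import math


def log_returns(close_prices):
    """Daily logarithmic returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(curr / prev)
            for prev, curr in zip(close_prices, close_prices[1:])]


# Hypothetical closing prices, for illustration only.
example_close = [100.0, 101.0, 99.5, 100.2]
example_returns = log_returns(example_close)  # one value fewer than prices
```

A series of N prices yields N - 1 returns, so the number of observations fed to the network is one smaller than the number of downloaded prices.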
After determining the returns and applying the fast Fourier transform (FFT), the corresponding graph in the frequency domain is obtained. It is also concluded that prices do not follow the normal distribution and deviate significantly from it, whereas returns have significantly better statistical characteristics.
The time series of returns is much closer to the normal distribution, but with heavy tails. This shows that unexpected events occur more often than under the normal distribution, which is characteristic of financial data and forecasts.
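One simple way to quantify heavy tails is the sample excess kurtosis, which is 0 for a normal distribution and positive when extreme values occur more often. The sketch below is an illustrative assumption about how such a check could be done; the text does not state which statistic was actually used.

```python
def excess_kurtosis(values):
    """Sample excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


# Toy data: mostly zeros with two extreme values gives a positive result.
example_kurtosis = excess_kurtosis([0.0] * 8 + [-3.0, 3.0])
```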
A linear dependence that is very important to observe during the analysis of time series is autocorrelation. In general, the question is whether the explanatory variables are determined by a stochastic term or whether there is an exact linear dependence between them. The absence of autocorrelation means that random errors are uncorrelated and that the covariance between them is equal to 0, so there is no pattern in the correlation structure of the random errors. Otherwise, if there is autocorrelation and the covariance is different from 0, the random errors are correlated and follow a recognizable pattern. In this case the results of statistical tests are biased, the confidence intervals are imprecise, and the prediction is unreliable. Autocorrelation can be genuine if it is a consequence of the nature of the data, or spurious if the model is incorrectly specified.
The Ljung-Box Q test is used to check whether the autocorrelation of a time series is different from 0. Ideally, the series of errors should be a process of independent random variables from the same distribution, that is, white noise; however, there is often a dependence in the series of errors. A lower degree of autocorrelation, or its complete absence, indicates that the market is mature.
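As an illustration of the test described above, the following sketch computes the sample autocorrelation and the Ljung-Box statistic Q = n(n+2) * sum over k = 1..h of rho_k^2 / (n-k). The function names are hypothetical, and the sketch only returns the statistic; it does not reproduce the MATLAB routine used in the experiment.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom


def ljung_box_q(series, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} rho_k^2 / (n-k)."""
    n = len(series)
    return n * (n + 2) * sum(
        autocorrelation(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )
```

Under the null hypothesis of no autocorrelation, Q approximately follows a chi-square distribution with h degrees of freedom, so for h = 2 a value above roughly 5.99 would reject the null at the 5% level.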
The autocorrelation functions of the S&P 500 index and the EUR/USD currency pair are shown in Figures 6 and 7, respectively. Figure 6 shows that the autocorrelation value falls outside the confidence interval for the first 2 lags, and therefore the default value of 2 should be used as the time delay in the network architecture. Due to the lack of statistically significant autocorrelation in the data, the NARX neural network will be used for analyzing the time series.
When the variances of random errors differ across individual observations, the phenomenon of heteroscedasticity is present. Its cause may be specification errors, the exclusion of an important regressor whose influence is then absorbed by the error term, or the existence of extreme values in the sample. To eliminate it, a weighted form of the least squares method is applied. The idea is that, in minimizing the sum of squared residuals, a smaller weight is given to residuals that are greater in absolute value, and vice versa.
Engle's ARCH test shows whether heteroscedasticity is present. The test returned the value 1 for both time series, so the null hypothesis (that the residual series does not show heteroscedasticity) is rejected, and it can be concluded that heteroscedasticity exists in both time series.
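The idea behind Engle's test can be sketched with NumPy: the squared residuals are regressed on their own lags, and the statistic is the number of usable observations times the R^2 of that regression. The function name and the choice of 2 lags are assumptions for illustration; the MATLAB routine used in the experiment may differ in details such as the default lag.

```python
import numpy as np


def arch_lm_statistic(residuals, lags=2):
    """Engle's ARCH LM statistic: n_eff * R^2 from regressing e_t^2 on its lags."""
    e2 = np.asarray(residuals, dtype=float) ** 2
    target = e2[lags:]
    # Design matrix: intercept plus e2 lagged by 1..lags.
    columns = [np.ones(len(target))]
    columns += [e2[lags - k:len(e2) - k] for k in range(1, lags + 1)]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((target - fitted) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return len(target) * r_squared
```

Under the null hypothesis of no ARCH effects, the statistic is approximately chi-square distributed with `lags` degrees of freedom, so with 2 lags a value above roughly 5.99 rejects the null at the 5% level, which corresponds to the test result 1 reported in the text.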

Development of the NARX network architecture
In this section, a brief review of well-known and useful mathematical tools from the field of machine learning is presented. For predicting indexes and prices on Forex and stock exchanges, a NARX neural network architecture is developed. The input data for the analysis are used both in the time domain and in the frequency domain, the latter obtained after applying the Fourier transform to the historical data [12,13].
The tool used is MATLAB® with a special set of functions known as the Neural Network Toolbox, which is applicable to finance. With the help of these functions, training, validation, and test sets can be generated from the original set with the corresponding percentage split. Then, several NARX networks are generated and trained on the training data. Subsequently, the networks are evaluated on the validation data in order to determine the network with appropriate behavior and to predict this behavior on the test set.
The NARX model can be implemented in many ways, but the simplest uses a feedforward neural network with embedded memory plus a delayed connection from the output of the second layer to the input. In practice, it has been observed that forecasting of a time series is enhanced by analyzing related time series. A two-layer feedforward network is used, with a sigmoid function in the hidden layer, which is the most common form of transfer function and is nondecreasing and nonlinear, and a linear transfer function in the output layer. The neural network is shown in Figure 8.
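The two-layer structure described above can be sketched as a single forward pass. The hyperbolic tangent is used here as one sigmoid-shaped transfer function; this choice, the function name, and the example weights are assumptions, since the text only states that the hidden layer is sigmoid and the output layer is linear.

```python
import math


def narx_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: tanh (sigmoid-shaped) hidden layer, linear output."""
    hidden = [
        math.tanh(sum(w * v for w, v in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical weights: 4 delayed inputs, 2 hidden neurons, 1 linear output.
example_prediction = narx_forward(
    inputs=[0.01, -0.02, 0.5, 0.4],
    hidden_weights=[[0.1, 0.2, -0.1, 0.0], [0.3, -0.2, 0.1, 0.1]],
    hidden_biases=[0.0, 0.1],
    output_weights=[0.5, -0.5],
    output_bias=0.0,
)
```

The inputs would be the delayed values of the output and exogenous series. During training the predicted output is replaced by the known target (open loop), while for multi-step forecasting the prediction is fed back through the delayed connection.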
The prediction method in the given experiment applies to changes in the exchange rate or changes in the stock exchange index over a certain period of time. The goal is to go beyond assumptions and to notice the specific pattern of observations along with the usual fluctuations. These fluctuations would mean that a certain inheritance or some kind of random variation occurred over a period of time. Finally, based on the data, a series with damped random fluctuations should be obtained, which indicates the long-term trend present in the time series and is then used to predict its future values.
Levenberg-Marquardt (LMA), a combination of gradient descent and the Gauss-Newton algorithm, is used as the learning algorithm, as opposed to Elman's recurrent networks, which use gradient descent with momentum. It is known as an advanced and fast algorithm for nonlinear optimization; unlike the quasi-Newton algorithm, LMA does not need to compute the Hessian matrix, so it has significantly better performance. Instead, the Jacobian matrix, which contains the first derivatives of the network errors, is used; it is computed by a backpropagation algorithm, which is easier than calculating the Hessian matrix. The aim is to reach the neighborhood of the minimum of the error function as quickly as possible [14].
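The core of the Levenberg-Marquardt update can be sketched as dw = -(J^T J + mu I)^(-1) J^T e, where J is the Jacobian of the errors e with respect to the weights. The toy linear problem, the function name, and the fixed damping value mu are illustrative assumptions; a practical implementation adapts mu after every step, decreasing it when the error falls and increasing it otherwise.

```python
import numpy as np


def levenberg_marquardt_step(jacobian, errors, mu):
    """Return the weight update dw = -(J^T J + mu I)^(-1) J^T e."""
    jtj = jacobian.T @ jacobian
    gradient = jacobian.T @ errors
    return -np.linalg.solve(jtj + mu * np.eye(jtj.shape[0]), gradient)


# Toy problem (illustrative): fit y = a * x + b, errors e = prediction - target.
x = np.array([0.0, 1.0, 2.0, 3.0])
target = 2.0 * x + 1.0
weights = np.zeros(2)                             # [a, b]
jacobian = np.column_stack([x, np.ones_like(x)])  # d e / d [a, b]
for _ in range(20):
    errors = jacobian @ weights - target
    weights = weights + levenberg_marquardt_step(jacobian, errors, mu=0.01)
```

For small mu the step approaches the Gauss-Newton step, and for large mu it approaches a short gradient descent step, which is the combination referred to in the text. Only J^T J is formed, so no second derivatives are required.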
The data for analysis are divided in the following way: 70% training, 15% evaluation, and 15% test.
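The 70/15/15 division can be sketched as follows. The sketch assumes a chronological split into contiguous blocks, which is an assumption: the text does not say whether the samples were divided in order or at random, and the function name is hypothetical.

```python
def split_series(samples, train_frac=0.70, val_frac=0.15):
    """Split samples chronologically into training, validation, and test parts."""
    n = len(samples)
    n_train = int(round(n * train_frac))
    n_val = int(round(n * val_frac))
    train = samples[:n_train]
    validation = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, validation, test


# Toy data, for illustration only: 100 samples give 70 / 15 / 15.
example_train, example_val, example_test = split_series(list(range(100)))
```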
After training the network, the results are shown in Figures 9-11. The epoch represents the number of iterations during the training in which it was attempted to minimize the error function.
The network architecture is such that the number of hidden neurons is initially set to 10, with 2 time delays. The network is applied to returns instead of prices for both time series, which are observed in the time and frequency domains. The smallest mean squared error occurred in the third epoch and is 1.11455 × 10⁻⁴. It represents the deviation of the predicted value from the actual value. The closer the number is to 0, the more accurate the results are.
The training error is higher than the error during testing, which means that the model did not overfit, as shown in Figures 10 and 11.
After ten consecutive trainings of the network, the smallest mean squared error appeared in the seventh epoch and is 1.11092 × 10⁻⁴. For the EUR/USD currency pair, the same training algorithm was used as in the analysis of the previous time series, and the subsets for training, validation, and testing were obtained with the same percentage values. The network architecture is identical, with a sigmoid function in the hidden layer and a linear function in the output layer. In the analysis of this time series, the smallest mean squared error occurred in the ninth epoch and is 3.71 × 10⁻⁵. It represents the deviation of the predicted values from the actual values.
The first network, for the stock exchange index S&P 500, was tested as a feedforward network. The smallest MSE was 1.23081 × 10⁻⁴ for training, 1.0336 × 10⁻⁴ for validation, and 1.1380 × 10⁻⁴ for testing. The network for the currency pair EUR/USD was also tested as a feedforward network. Its smallest MSE was lower than for the first network: 3.6199 × 10⁻⁵ for training, 3.4246 × 10⁻⁵ for validation, and 3.4792 × 10⁻⁵ for testing.
The algorithm is again trained on 70% of the data, validated on 15%, and tested on 15%. Each network consists of two layers. The hidden layer has ten neurons with a sigmoid transfer function, and the output layer has one neuron with a linear transfer function. In the second network, a smaller average mean squared error was detected than in the first one. Also, the standard deviation of the mean squared error for the second network is lower than for the first one in all three stages: training, validation, and testing. The results for each iteration and the summary of the mean squared error for the S&P 500 are presented in Tables 1 and 2, respectively.
The results for each iteration and summary of mean squared error are presented in Tables 3 and 4 for EUR/USD currency pair, respectively.
Unlike the analysis of time series in the time domain, in the frequency domain, it is interesting to consider the spectrum of the amplitude (relative share of a certain frequency component relative to the other) of the historical price for the stock index S&P 500 and the currency pair EUR/USD in several different aspects. These analyses include the spectral analysis of time series, which are usually used for stationary time series. This is a good assumption for adjusted stock prices in the frequency domain statistics [15].

Figure: Histogram of time series errors for the S&P 500 time series.

Fourier Transforms - Century of Digitalization and Increasing Expectations
For converting to the frequency f_k, it should be emphasized that, if daily prices are used as the input signal, the sampling frequency is equal to 1 [1/day], which means that the frequencies must be rescaled.

The unit of the new set of discrete frequencies is [1/day], which is the form of the real frequencies required in this analysis. Also, according to the sampling theorem, only those signal components with a frequency less than or equal to Fs/2 = 0.5 day⁻¹ can be measured without the aliasing effect. Considering these facts, it is necessary to limit the frequency coordinates to the range from 0 to 0.5.
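The frequency axis described above can be sketched with NumPy. For a real signal of length N sampled at Fs = 1 per day, the one-sided FFT returns components at f_k = k * Fs / N, which run from 0 up to the Nyquist limit of 0.5 per day. The function name and the test signal are hypothetical.

```python
import numpy as np


def amplitude_spectrum(signal, sampling_frequency=1.0):
    """One-sided amplitude spectrum with frequencies f_k = k * Fs / N."""
    values = np.asarray(signal, dtype=float)
    amplitudes = np.abs(np.fft.rfft(values))
    frequencies = np.fft.rfftfreq(len(values), d=1.0 / sampling_frequency)
    return frequencies, amplitudes


# Toy signal: a cosine with a 4-day period, sampled once per day for 64 days.
days = np.arange(64)
freqs, amps = amplitude_spectrum(np.cos(2 * np.pi * 0.25 * days))
peak_frequency = freqs[np.argmax(amps)]  # frequency of the dominant component
```

Using the one-sided transform keeps only the non-negative frequencies, which already restricts the axis to the range from 0 to 0.5 for daily data.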
In order to better understand the shape of the spectrum, a log-log scale is used, with the logarithm of the amplitude values obtained after applying the FFT. From the slope of such a curve, it can be observed whether the amplitude spectrum is close to the special power-law form 1/f. Using a logarithmic scale is a good way to avoid overestimating high-frequency components.
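Whether a spectrum is close to the 1/f form can be checked by fitting a straight line to the logarithm of the amplitude against the logarithm of the frequency: a slope near -1 corresponds to amplitude proportional to 1/f. The sketch below is an illustrative assumption about how this could be done, with a hypothetical function name and a synthetic spectrum.

```python
import numpy as np


def loglog_slope(frequencies, amplitudes):
    """Least-squares slope of log10(amplitude) against log10(frequency)."""
    f = np.asarray(frequencies, dtype=float)
    a = np.asarray(amplitudes, dtype=float)
    keep = (f > 0) & (a > 0)          # the logarithm is undefined at 0
    slope, _intercept = np.polyfit(np.log10(f[keep]), np.log10(a[keep]), 1)
    return slope


# Synthetic 1/f spectrum, for illustration only.
example_f = np.linspace(0.01, 0.5, 50)
example_slope = loglog_slope(example_f, 3.0 / example_f)
```

The zero-frequency component is excluded before the fit because its logarithm is undefined.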
After applying the FFT to prices and returns, equivalent time series in the frequency domain are obtained. As in the above procedure, in order to better detect the spectrum, the modulus representing the amplitude was computed, and then its logarithm was taken. The obtained values for the S&P 500 index and the EUR/USD currency pair were used to train the NARX neural network. The average mean squared errors obtained after ten consecutive trainings are 1.5738 × 10⁻¹ and 4.8713 × 10⁻¹, respectively, which are significantly higher than those obtained in the time domain. The conclusion is that, regardless of the time series being analyzed, the results in the frequency domain are significantly worse and the prediction is less reliable.
The simulation performed with the logarithm of the amplitude as the input and the frequency as an exogenous input did not show good training and convergence, even after the maximum of 1000 iterations, nor the corresponding statistical characteristics; hence, its analysis would make no sense.
Due to its wide practical application in various fields, Fourier transform is increasingly in the focus of international scientific meetings, as well as numerous publications (scientific monographs, journals, chapters, etc.), whether it is economics, biomedicine, chemical engineering, electronics, or art [16].

Various computational intelligence methods in finance
Besides the method of computational intelligence applied in this chapter, other methods are often applied in the same domain. Bankruptcy is one of the main issues threatening many companies and governments, and its prediction is a complex process that consists of numerous inseparable factors. Financial distress begins when an organization is unable to meet its scheduled payments or when the projection of future cash flows points to an inability to meet the payments in the near future. The causes leading to business failure and subsequent bankruptcy can be divided into economic, financial, fraud, disaster, and others. With more accurate bankruptcy detection techniques, companies could take preventive measures in order to minimize the risk of falling into bankruptcy [17]. There are two dominant approaches to predicting bankruptcy. The first uses multiple discriminant analysis, a univariate approach (in which net income to total debt has the highest predictive ability), and stochastic models such as logit and probit. The second uses artificial intelligence adapted for predicting bankruptcy (decision trees, fuzzy set theory, genetic algorithms, and support vector machines). Neural networks such as BPNN (backpropagation-trained neural network), PNN (probabilistic neural network), or SOM (self-organizing map) could also be developed. In the cited work, three LC models are tested to see whether they are able to improve on the Altman Z-score as a benchmark model for bankruptcy prediction. Even though the LC method shows more accurate results, the Altman model behaves slightly better for gray-zone companies, where it is important to reduce the number of bankrupt firms identified as active.
Modern approaches need to introduce different ways of modeling similarity, in particular using IBA, which is performed in two main steps. The first is data preprocessing (data normalization, detection of attribute nature, and their potential interaction), where normalization functions may be adapted depending on data range and distribution. It is also recommended to use correlation to detect similar nature between attribute data, because the existence of significant correlation in attribute data could overemphasize certain attributes and cause incoherent model results. The second is IBA similarity modeling (attribute-by-attribute comparison, comparison on the level of the object, and a general approach), which shows what kind of aggregation is appropriate for similarity modeling.
In this case it is shown that the IBA-based similarity framework has a solid mathematical background and can also be expanded to model nonmonotonic inference. Its practical advantage is evaluated on two numerical examples. The first example confirms the motivation and reasoning behind the novel OL comparison, which matters when one of an object's attributes is logically dependent on, or can be compensated by, another attribute. In the second example the proposed similarity framework is applied to predicting corporate bankruptcy with different KNN classifiers [18].

Conclusion
Analysis of time series is a specific topic that is indispensable in data science and statistical analysis. By combining such analysis with a tool such as a neural network, especially in an increasingly important area such as finance, it is certain that in the future it can conquer new territories and have a global impact. To obtain financial protection from losses and safe investments without excessive risk, it is necessary to apply modern methods with continuous upgrading and improvement. In cooperation with an existing platform with varied parameters and transactional data, this tool would be a good prerequisite for successful forecasting of trends and secure business.
The obtained results of the time series analysis confirmed the possibility of a good prediction. Better forecasting can be done for the Forex time series (EUR/USD), in the time domain without applying the Fourier transform to the input data. In this sense, NARX proved to be a good method for solving the given type of problem in the time domain, but in the frequency domain, it is recommended that the analysis be carried out by a classical feedforward neural network with the backpropagation algorithm. The results of the research indicated that NARX is capable of providing a certain amount of security to those entities that invest their funds, as well as pointing out future expectations. On the other hand, the results of this paper give only a proposal and advice on how to behave on the market during trading. One should always be cautious, given the already mentioned market variability. Timeliness is also important, because when a particular piece of news arrives on the market, the market reacts with certain changes. The news is then incorporated into the price, and the market returns to the state it was in before the news arrived.
Proposals for the improvement of the neural network are:
• Include new input parameters that can be reached by new research, or prepare the training data differently to verify the credibility of this network in a dynamic environment.
• Change the number of neurons in the hidden layer, the time delay, or the activation function in the hidden and output layers.
• Use the network results as input to a new network together with a change in the time period, which can give a broader picture of the trend of the observed currency pair or stock exchange index.

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.