## 1. Introduction

Time series analysis has a long history in the social sciences, especially in economics and finance. As is well known, much of economics and finance is concerned with modeling dynamics, and the systematization of data over time was a subject that appeared early. In particular, two empirical topics become important when working with time series in the social sciences: inference and forecasting. The accumulated historical data made it possible to apply statistical methods in order to find evidence of causation between social variables, providing some support for social theories. Considering the nonexperimental nature of the social sciences, this also encouraged the development of statistical techniques. In fact, while in physics it is relatively easy to obtain hundreds of thousands of observations for a given time series, in economics there are often only 50 or 100 observations per series, and perhaps thousands in financial series. For this reason, much of the statistical effort, and the econometric effort in particular, was focused on developing powerful statistical tests given the availability of only small samples. This is an important difference in approach between econometrics and, for example, statistical mechanics in theoretical physics.

We can identify two main groups in time series econometrics: univariate time series analysis, concerned with techniques for the analysis of dependence in adjacent observations, which has grown in importance since 1970 based on the main ideas underlying [1]; and multivariate time series analysis based on vector autoregressive (VAR) models, made popular by [2]. In the first group, we find all the autoregressive integrated moving average (ARIMA) models and the related generalized autoregressive conditional heteroscedasticity (GARCH) models developed by [3]. The second group is a generalization of the AR models, and two important developments build on it: cointegration, proposed by [4], focusing on finding a statistical relationship between variables; and the noncausality test developed by [5], which builds on the concept of predetermination to test whether one variable causes another. Much of the development in time series econometrics is found in books such as [6–18].

In summary, dependence and causation are two important topics in time series econometrics and time series analysis. These topics are related to the importance of inference and forecasting in the social sciences. Econometrics has focused on developing powerful tests considering the available small samples. Most of these developments are based on linear models, even if there are some developments considering nonlinearities; see for instance [19, 20].

Time series analysis in econometrics is mostly based on observations belonging to the set of real numbers. Some variables can be categorical, such as dummy variables. However, in this chapter, we discuss a different approach known as symbolic time series analysis (STSA). It was originally applied in physics and engineering as a statistical methodology to detect the underlying dynamics of highly noisy time series. Its application to social sciences such as economics or finance is very recent, and there are some novel developments.

As mentioned before, the application of STSA in the social sciences requires a different approach due to data limitations. In this sense, the design of powerful tests considering the availability of data is crucial. As mentioned above, dependence and causation are two important topics. Accordingly, we review an independence test and a first approach to testing noncausality, both based on STSA. Information theory is adopted as the approach to analyzing the symbolic time series, with an approximation of the Shannon entropy as the key measure applied to test design.

The chapter is organized as follows. Section 2 presents the symbolic time series approach and its relation to symbolic dynamics. In Section 3, we review some of the literature on STSA applied to the sciences. In Section 4, the information theory approach and the Shannon entropy measure are explained. Section 5 presents a review of the symbolic independence test. Section 6 focuses on the causality test based on STSA. Section 7 discusses the differences between the proposed symbolic noncausality test and the traditional, well-known Granger noncausality test. Finally, in Section 8, we draw some conclusions and present some future lines of research.

## 2. Symbolic time series analysis

The concept of symbolization has its roots in dynamical systems theory, particularly in the study of nonlinear systems, which can exhibit bifurcation and chaos. In [21], it is asserted that symbolic dynamics is a method for studying nonlinear discrete-time systems by codifying a trajectory using strings of symbols from a finite set, also called an alphabet. According to [22], symbolic dynamics and symbolic analysis are connected but different concepts. The former is the practice of modeling a dynamical system by a discrete space, whereas the latter is an empirical approach that characterizes highly noisy data by considering a partition, discretizing the data, and obtaining a string representing the underlying dynamics of the process.

As asserted by [23], symbolization involves the transformation of raw time series measurements into a series of discretized symbols that are processed to extract information about the generating process. In this way, we can search for nonrandom patterns and dependence by transforming a given time series {*x*_{1}, *x*_{2},…, *x*_{T}} into a symbolic string {*s*_{1}, *s*_{2},…, *s*_{T}}.

The STSA approach is easy to apply, but defining the right partition is the most difficult part. Generally, an equiprobable partition is applied, which means taking the empirical distribution of a given time series {*x*_{1}, *x*_{2},…, *x*_{T}} and establishing two or more equally probable regions. For instance, for a zero-mean Gaussian time series, we can define two equally probable regions by taking the mean, equal to zero, as the partition point. After that, we can assign the symbol *s*_{i} = 0 to negative values and *s*_{i} = 1 to positive ones. In this way, we transform a continuous random series into a discrete string similar to the outcomes of flipping a coin.
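As a minimal sketch, the equiprobable symbolization described above can be implemented as follows (the function name `symbolize` and the use of NumPy quantiles are our own illustration, not the chapter's code):

```python
import numpy as np

def symbolize(x, a=2):
    """Discretize a series into `a` equiprobable regions using
    empirical quantiles as the partition points."""
    x = np.asarray(x, dtype=float)
    # cut points splitting the empirical distribution into `a` equal-mass bins
    edges = np.quantile(x, [i / a for i in range(1, a)])
    return np.digitize(x, edges)  # symbols 0, 1, ..., a-1

rng = np.random.default_rng(0)
x = rng.normal(size=1000)           # Gaussian series
s = symbolize(x, a=2)               # 0 below the median, 1 above
print(np.bincount(s) / len(s))      # each region holds ~1/2 of the observations
```

By construction, each symbol appears with (empirical) probability 1/*a*, mimicking the coin-flipping string of the text.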

## 3. STSA in applied sciences

In [23], the applications of STSA techniques to different fields of science are reviewed. According to the authors, the different applications suggest that symbolization can increase the efficiency of finding and quantifying information from the systems. Mechanical systems were one of the first applications where symbolic analysis was successfully used to characterize complex dynamics. In [24–26], symbolic methods are applied to the analysis of experimental combustion data from internal combustion engines. The objective was to study the onset of combustion instabilities as the fueling mixture was leaned. STSA has also been applied in astrophysics and geophysics. For instance, [27] analyzes weak reflected radar signals from the planet Venus to measure its rotational period. In [28], a binary symbolization is utilized to analyze solar flare events. Biology and medicine are other fields where STSA has been applied, with many recent applications for biological systems, most notably laboratory measurements of neural systems and clinical diagnosis of neural pathologies. In [29, 30], equal-sized intervals are used to partition EEG signals in order to identify seizure precursors in electroencephalograms. [31] proposed a new damage localization method based on STSA to detect and localize gradually evolving deterioration in a system; the authors assert that this method could be demanded in real-time observation applications such as structural health monitoring. In [32], STSA is used to study human gait dynamics; the results have implications for modeling physiological control mechanisms and for quantifying human gait dynamics under physiological and stressed conditions. In [33], heart-rate dynamics are studied using partitions aligned on the data mean and ±1 and ±2 sample standard deviations, for a symbol-set size of 6. In [34], the prevalence of irreversibility in the human heartbeat is analyzed applying STSA.

Applications of symbolization to fluid flow measurements have spanned a wide range of data types, from global measurements of flow and pressure drop, to the formation and coalescence of bubbles and drops, to spatiotemporal measurements of turbulence. In [35], an approach for transforming images of complex flow fields (as well as other textured fields) into a symbolic representation is developed. In [36], STSA is applied to networks of genes, whose structure underlies the normal development and function of organisms; information about the structure of the genome of humans and other organisms is increasing exponentially. In [37], equiprobable symbols are used for analyzing measurements from free liquid jets in order to readily discriminate between random and nonrandom behavior. In [38], STSA is applied to the detection of incipient faults in commercial aircraft gas turbine engines. In [39], combustion instability in a swirl-stabilized combustor is investigated using STSA. Chemistry-related applications of symbolic techniques have been developed for chemical systems involving spontaneous oscillations or propagating reaction fronts. In [40], a type of symbolization is applied to improve the performance of Fourier-transform ion-cyclotron mass spectrometry. Artificial intelligence, control, and communication are further fields where symbolization has been incorporated. In [41], a phase-space partitioning is used to model communication. An example application of symbolization to communication is found in [42], utilizing small perturbations to encode messages in oscillations of the Belousov-Zhabotinsky (BZ) reaction. In robotics, a symbolic time series–based statistical learning method has been developed to construct generative models of the gaits (i.e., the modes of walking) of a robot; see [43]. The efficacy of the proposed algorithm is demonstrated by laboratory experimentation to model and then infer the hidden dynamics of different gaits for the T-hex walking robot. In [44], an algorithm to intuitively cluster groups of agent trails from networks based on STSA is proposed. The authors assert that temporal trails generated by agents traveling to various locations at different time epochs are becoming more prevalent in large social networks. The algorithm was applied to real-world network trails obtained from merchant marine ships' GPS locations; it is able to intuitively detect and extract the underlying patterns in the trails and form clusters of similar trails.

The methods of data symbolization have also been applied to data mining, classification, and rule discovery. In [45], rule discovery techniques are applied to real-valued time series via a process of symbolization. Finally, we find some applications of STSA in social science. In [46–48], STSA and minimal spanning trees (MST) are applied to construct clusters of financial assets, with application to portfolio theory. Using a similar methodology, [49] studies the dynamics of the exchange market, and [50] analyzes the international hotel industry in Spain. In [51, 52], STSA and entropy are applied to measure informational efficiency in financial markets.

## 4. Information theory and Shannon entropy

The term entropy was first used by Rudolf Clausius in [53] in relation to the second law of thermodynamics. Subsequently, communication theory [54] used the Shannon entropy as a measure of uncertainty, where maximum entropy corresponds to the maximum degree of uncertainty. In this sense, a random process will take the maximum entropy value. In fact, the English language is not a random process; some patterns such as "THE" are more probable than sequences such as "DXC". Note that in a random process, the two sequences would have the same probability. This principle is very relevant because if a symbolic string is random, its entropy should be maximal.

The entropy measure (H) must meet the following conditions:

- *H*(*P*) should be a function of the probability distribution of the *n* events, expressed as the vector *P* = (*p*_{1}, *p*_{2}, …, *p*_{n}).
- (*Continuity*) *H*(*P*) should be a continuous function of the vector *P*.
- (*Symmetry*) the measure should be unchanged if the outcomes *p*_{i} are reordered.
- (*Expansible*) events of probability zero should not contribute to the entropy: *H*(*p*_{1}, *p*_{2}, …, *p*_{n}, 0) = *H*(*p*_{1}, *p*_{2}, …, *p*_{n}).
- (*Minimum*) the measure should take the value 0 when there is no uncertainty.
- (*Maximum*) the measure should be maximal if all the outcomes are equally likely, that is, *p*_{1} = *p*_{2} = … = *p*_{n} = 1/*n*. For equiprobable events, the entropy increases with the number of outcomes: *H*(*p*_{1} = 1/(*n* + 1), …, *p*_{n + 1} = 1/(*n* + 1)) > *H*(*p*_{1} = 1/*n*, …, *p*_{n} = 1/*n*).

In [54], the Shannon entropy function is proposed, of the standard form *H*(*P*) = −Σ_{i} *p*_{i} log(*p*_{i}), summing over the *n* events.

The entropy is frequently measured in bits by using log base 2, satisfying all the properties already mentioned. Note that the maximum property can be confirmed by solving the Lagrangian expression (2), which maximizes the entropy subject to the constraint that the probabilities sum to one.
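Eq. (2) is not reproduced in this text; a sketch of the constrained maximization it refers to, under our reconstruction, is:

```latex
\max_{p_1,\dots,p_n}\; \mathcal{L}
  = -\sum_{i=1}^{n} p_i \log_2 p_i \;+\; \lambda\Big(\sum_{i=1}^{n} p_i - 1\Big),
\qquad
\frac{\partial \mathcal{L}}{\partial p_i}
  = -\log_2 p_i - \frac{1}{\ln 2} + \lambda = 0 .
```

The first-order condition yields the same value of \(p_i\) for every \(i\); combined with the constraint, \(p_i = 1/n\), confirming that the uniform distribution attains the maximum.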

The Shannon entropy is concave, with a global maximum when all the probabilities are equal. In addition, when *p*_{i} = 0, the convention that 0·log 0 = 0 is used. Thus, adding zero-probability terms does not change the entropy value.

In order to clarify the concept of Shannon entropy, consider two possible events and their respective probabilities *p* and *q* = 1 − *p*. The Shannon entropy is then defined by Eq. (3), the binary entropy *H* = −*p* log(*p*) − *q* log(*q*).

**Figure 1** shows the shape of the function graphically; note that the maximum is obtained when the probability is 0.5 for each event. This case corresponds to a random event. On the other hand, note that a certain event (when the probability of one event is 1) produces an entropy equal to 0.

In general, [55] showed that any measure satisfying all the properties must take the form of the Shannon entropy up to a positive multiplicative constant *c*.

In order to normalize the Shannon entropy, *c* usually takes the value 1/log_{2}(*n*), allowing the comparison of event sets of different sizes.
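A minimal sketch of the normalized measure (the function name and the choice to normalize by the observed alphabet size are our own assumptions):

```python
import math
from collections import Counter

def normalized_entropy(symbols):
    """Shannon entropy in bits, normalized by 1/log2(n) so that the
    result lies in [0, 1]: 1 for a uniform distribution, 0 for certainty."""
    counts = Counter(symbols)
    total = len(symbols)
    n = len(counts)
    if n <= 1:
        return 0.0  # a certain outcome carries no uncertainty
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(n)

print(normalized_entropy("01010101"))  # two equiprobable symbols -> 1.0
print(normalized_entropy("00000001"))  # uneven distribution -> below 1
```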

## 5. Symbolic independence test

STSA seems to perform well when testing for independence in time series. A variety of dynamical processes are present in economics: linearity, nonlinearity, deterministic chaos, and stochastic models have all been applied when modeling a complex reality. In [56], a runs test is designed, asserting that the problem of testing randomness arises frequently in the quality control of manufactured products. It is worth remarking that detecting dependence in time series is an essential task for econometricians and applied economists. In [57], the well-known BDS test is introduced, considered a powerful test for detecting nonlinearity. In [58], a simple and powerful test based on STSA is proposed, and its results are compared with the BDS and runs tests. On one hand, it is found that BDS is not able to detect processes such as the chaotic Anosov map and the stochastic nonlinear sign (NLSIGN), nonlinear autoregressive (NLAR), and nonlinear moving average (NLMA) models. On the other hand, the runs test cannot detect the chaotic Anosov map or the logistic, bilinear, NLAR, and NLMA stochastic processes. The experiments show that the test based on STSA has no problem detecting all of these dynamics. It is concluded that the proposed test is simple, easy to compute, and powerful relative to the other two tests. In particular, for small samples, it is the only one able to detect models such as the chaotic Anosov map and the nonlinear moving average (NLMA) model. In addition, the test is applied to financial time series to detect nonlinearity in the residuals after fitting a GARCH model. In this case, the BDS test rarely rejected independence, whereas the SRS test still detected nonlinearity in the residuals. It seems that BDS considers the GARCH(1,1) model a good model most of the time, while the symbolic test suggests that GARCH(1,1) does not capture all the nonlinear components.

Here, we briefly review the test and repeat some experiments, comparing the results with the well-known BDS and runs tests. First, let us consider a finite time series {*x*_{t}}_{t = 1,2,…,T*} of size *T** generated by an independent or random process. Define a partition of the series in "*a*" equiprobable regions, obtaining the symbolized time series {*s*_{t}}_{t = 1,2,…,T*}, where each symbol *s*_{t} takes a value from the alphabet *A* = {*A*_{1}, *A*_{2},…, *A*_{a}}. Since we want to derive a general statistic for different alphabet sizes *a* and different subsequence lengths *w*, we have to make two considerations: (1) from now on, we will call *n* the quantity of possible events, that is, *n* = *a*^{w}, where the simplest case (*w* = 1) implies *n* = *a*, so the quantity of events is equal to the symbol-set size; (2) in practice, we have a finite sample size *T**. There is no problem for *w* = 1, but when we compute subsequences or time windows of *w* consecutive symbols we lose observations. For example, when we compute the frequency of two consecutive symbols, the total sample size is *T** − 1. In general, we can define the sample size *T* such that *T** = *T* + *w* − 1; again, for the trivial case *w* = 1, *T** = *T*.

Note that, defining *S*_{i} for *i* = 1,2,…,*n* as the total count of event *i* in the time series, we can derive the multidimensional variable *S* = {*S*_{i}/*T*}, distributed as a multinomial with *E*(*S*_{i}/*T*) = (1/*n*), *Var*(*S*_{i}/*T*) = (1/*n*)(*n* − 1)/*nT*, and *Cov*(*S*_{i}/*T*, *S*_{j}/*T*) = −(1/*n*)(1/*nT*) for all *i* ≠ *j*. As we will see, the frequencies of the events are central to the statistic, and the vector of the *n* frequencies *S*_{i}/*T* can be approximated by a multivariate normal distribution *N*(1/*n*, *σ*^{2}*Σ*), where *σ*^{2} is (1/*nT*) and *Σ* is an idempotent matrix as in (5).

For convenience, we can define the normalized vector variable {*ε*_{i}}_{i = 1,2,…,n} = {(*S*_{i}/*T*) − (1/*n*)}, having a multivariate normal distribution *N*(*ø*, *σ*^{2}*Σ*), *ø* being the null vector. Then, the statistic can be defined as a quadratic form in random normal variables (6).

In [58], the distribution of quadratic forms in normal variables presented in [59] is applied. *X* = (*ε*_{1}/*σ*, *ε*_{2}/*σ*,…, *ε*_{n}/*σ*) is distributed multivariate normal *N*(*ø*, *Σ*). The theorem indicates that *tr*(*ΑΣ*) = *n* − 1, and thus *X*′*ΑX* is distributed Chi-square with (*n* − 1) degrees of freedom. In this case, *Α* is the identity matrix *I*, and *Σ* is symmetric, singular, and idempotent. Remembering that *σ*^{2} = (1/*nT*), we obtain the distribution of the symbolic randomness statistic (SRS) as in (7): *SRS*(*a*,*w*) = *nT*{*Σ*(*S*_{i}/*T* − 1/*n*)^{2}}, distributed Chi-square with *n* − 1 degrees of freedom.

Note that in practice computing the statistic is very simple. We just have to choose the alphabet size (*a*) and the subsequence length (*w*) and compute the frequencies of each of the *n* = *a*^{w} events in the time series.

The algorithm to compute the test is as follows:

Step 1: Considering the time series {*x*_{t}}_{t = 1,2,…,T*}, compute the empirical distribution and define equiprobable regions according to the quantity of symbols or the alphabet size.

Step 2: According to the partition, translate {*x*_{t}}_{t = 1,2,…,T*} into {*s*_{t}}_{t = 1,2,…,T*}, the symbolic time series for *w* = 1.

Step 3: Compute different symbolic time series for different lengths *w*, remember that the obtained series in step 2 corresponds to *w =* 1.

Step 4: For each *w*, compute the frequency of the *n* different events, *S*_{i}/*T* for *i* = 1, 2,…, *n*.

Step 5: For each *w*, compute *SRS*(*a*,*w*) = *nT*{*Σ*(*S*_{i}/*T* − 1/*n*)^{2}} as shown in Eq. (7).

Step 6: Compare *SRS*(*a*,*w*) with the Chi-square critical value with *n* − 1 degrees of freedom at the 0.05 significance level, under the independence null hypothesis. When *SRS*(*a*,*w*) is larger than the critical value, we reject the null hypothesis.
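The steps above can be sketched in Python (a sketch under our own naming; `srs_test`, the NumPy quantile partition, and the SciPy critical value are our illustration, not the authors' code):

```python
import numpy as np
from scipy.stats import chi2

def srs_test(x, a=2, w=1, alpha=0.05):
    """Symbolic randomness statistic following the steps above:
    symbolize into `a` equiprobable regions, count length-w words,
    and compare SRS(a, w) with the chi-square critical value."""
    x = np.asarray(x, dtype=float)
    edges = np.quantile(x, [i / a for i in range(1, a)])   # Step 1
    s = np.digitize(x, edges)                              # Step 2: symbols 0..a-1
    T = len(s) - w + 1                                     # usable sample size
    # Steps 3-4: overlapping windows of w symbols, encoded as base-a integers
    events = sum(s[j:j + T] * a ** (w - 1 - j) for j in range(w))
    n = a ** w
    freq = np.bincount(events, minlength=n) / T
    srs = n * T * np.sum((freq - 1.0 / n) ** 2)            # Step 5: Eq. (7)
    return srs, srs > chi2.ppf(1 - alpha, df=n - 1)        # Step 6

rng = np.random.default_rng(1)
print(srs_test(rng.normal(size=2000), a=2, w=4))   # i.i.d. noise: typically not rejected

x = np.empty(2000)
x[0] = 0.3
for t in range(1, 2000):
    x[t] = 4.0 * x[t - 1] * (1.0 - x[t - 1])       # chaotic logistic map
print(srs_test(x, a=3, w=2))                       # strong word structure: rejected
```

For the logistic map with three symbols, a value in the highest tercile is almost never followed by another, so the word frequencies deviate strongly from 1/*n* and the statistic far exceeds the critical value.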

In [58], it is found that the statistic introduced in (7) is related to the Shannon entropy (H). We can derive the approximation expressed in Eq. (8).

Note that the generalization implied by STSA permits the study of very different dynamical processes. For instance, consider a string of the first 3000 letters from the book "*A Christmas Carol*", *s*_{1} = {*marleywasdeadtobeginwith…scroogecar*}, and a random string of 3000 letters from an alphabet of 26, *s*_{2} = {*iskynbmhjp…vbbihjfkk*}. Imagine testing this kind of process with the BDS or runs test. In contrast, it is easy to test these dynamics with the symbolic test: we can simply define an alphabet of 26 letters. On the one hand, applying *SRS*(26,1) and *SRS*(26,2) to *s*_{1}, we obtain the values 2102.40 and 12331.26, respectively. On the other hand, *SRS*(26,1) and *SRS*(26,2) for the string *s*_{2} are 25.79 and 690.26, respectively. The Chi-2 critical value with 25 degrees of freedom at 95% is 37.65, and with 675 degrees of freedom (26^{2}–1) it is 736.55. Since the statistics for *s*_{1} are larger than the critical values, we conclude that the process is not random. However, since the statistics for *s*_{2} are less than the critical values, we cannot reject the hypothesis of independence.

In [58], the test is shown to be conservative, rejecting the null hypothesis less often than expected. However, it is powerful in detecting nonrandom and nonlinear processes. Considering the four sample sizes, selecting two symbols and length 4 gives decent results in most cases. Selecting three symbols seems to be a relatively good option for sample sizes of 200 or larger, and especially for samples of 500 or larger. The best result is obtained for a sample of 2000 applying three symbols and length 4. **Table 1** presents the experiments using 1000 Monte Carlo simulations on Normal, Logistic, NLMA, Anosov, and NLSIGN processes, reproducing the experiments in [58].

Note that the symbolic test is more conservative than the BDS and runs tests when rejecting independence in a normal random process. However, the symbolic test is powerful in detecting nonlinearities in the studied processes. For a sample of 50, the logistic model is detected 100% of the time by the symbolic test, while BDS detects it 68% of the time and the runs test rejects independence 23.90% of the time. The logistic model remains hard for the runs test to detect even when the sample increases to 2000. Note that the NLMA model is detected by the symbolic test when the sample is 500 or larger, but it is not detected by the BDS or runs tests. It is interesting to note that the chaotic Anosov process is detected by the symbolic test for samples larger than 500, while both the BDS and runs tests reject independence in less than 6% of the cases. NLSIGN is hard to detect: for a sample of 2000, the symbolic test detects it in more than 90% of the cases and the runs test in 84% of the cases, whereas BDS cannot detect the NLSIGN process at all. Similar results are obtained in [58], where the proposed SRS is the only test able to detect the chaotic Anosov and nonlinear NLMA processes when T = 2000.

## 6. Symbolic noncausality test

The present section reviews the symbolic noncausality test (SNC) and discusses its differences from the classical Granger noncausality test. As in the case of the independence test, the main idea is to derive the asymptotic distribution of the statistic when there is no causality between the series. A full explanation of the test is given in [60].

Let us consider that *X* and *Y* are two independent random time series of size *T* + 1, whose symbolized time series can be expressed as *Sx* = {*sx*_{1}, *sx*_{2},…, *sx*_{T + 1}} and *Sy* = {*sy*_{1}, *sy*_{2},…, *sy*_{T + 1}}. To test causality, we have to define two new series, grouping *Sx* and *Sy* in the following way:

(1) *Sxy* = {(*sx*_{1}, *sy*_{2}), (*sx*_{2}, *sy*_{3}),…, (*sx*_{t−1}, *sy*_{t}),…, (*sx*_{T}, *sy*_{T + 1})}

(2) *Syx* = {(*sy*_{1}, *sx*_{2}), (*sy*_{2}, *sx*_{3}),…, (*sy*_{t−1}, *sx*_{t}),…, (*sy*_{T}, *sx*_{T + 1})}

If the alphabet is composed of three symbols, the combination (*sx*_{t−1}, *sy*_{t}) takes a value from the set of nine possible events {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}. Note that each event should be independent, with probability 1/9 (*Sx* and *Sy* are random). Only if at least one event deviated from 1/9 would there be evidence against noncausality.

An alphabet of *a* = 3 symbols determines *n* = 3^{2} = 9 possible events in the set of pairs {(*x*_{t−1}, *y*_{t})} or {(*y*_{t−1}, *x*_{t})}. Considering "*a*" symbols and *n* = *a*^{2} events, the vectors of the *n* frequencies *Exy*_{i}/*T* and *Eyx*_{i}/*T* can be approximated by a multivariate normal distribution *N*(1/*n*, *σ*^{2}*Ω*), where *σ*^{2} is (1/*nT*) and *Ω* is an idempotent matrix as in (9).

Following a similar approach as in Section 5, the statistics for both hypotheses can be defined as in (10) and (11).

The terms in brackets in (10) and (11) are quadratic forms in random normal variables. Applying the theorem presented in [59], the vector *X* = (*ε*_{1}/*σ*, *ε*_{2}/*σ*,…, *ε*_{n}/*σ*) is distributed multivariate normal *N*(*ø*, *Ω*). As mentioned in Section 5, *tr*(*ΑΩ*) = *n* − 1, and thus *X*′*ΑX* is distributed Chi-square with (*n* − 1) degrees of freedom. In this case, *Α* is the identity matrix *I* and *Ω* is symmetric, singular, and idempotent.

Note that we derived the test assuming that *X* and *Y* are random processes. However, we can apply the test to stationary time series and, optionally, fit an autoregressive process first if we want to remove linear dependence and test noncausality between the residuals of the two series.

Finally, the statistics of noncausality SNC(X → Y) and SNC(Y → X) are defined as in (14) and (15).

Note that in practice, computing the statistic is very simple. In summary, the test works as follows:

Step 1: Consider the time series {*x*_{t}}_{t = 1,2,…,T + 2} and {*y*_{t}}_{t = 1,2,…,T + 2}. We can optionally apply an AR(1) to both series, as in (12) and (13), in order to eliminate autocorrelation, defining the new residual time series {*ux*_{t}}_{t = 1,2,…,T + 1} and {*uy*_{t}}_{t = 1,2,…,T + 1}. Note that one observation is lost after applying the AR(1).

Step 2: Apply a partition in "*a*" equiprobable regions to {*ux*_{t}}_{t = 1,2,…,T + 1} and {*uy*_{t}}_{t = 1,2,…,T + 1} and translate the series into {*sx*_{t}}_{t = 1,2,…,T + 1} and {*sy*_{t}}_{t = 1,2,…,T + 1}.

Step 3: According to the two hypotheses, *X → Y* and *Y → X*, define the two sets *Sxy* = {(*sx*_{1}, *sy*_{2}), (*sx*_{2}, *sy*_{3}),…, (*sx*_{t−1}, *sy*_{t}),…, (*sx*_{T}, *sy*_{T + 1})} and *Syx* = {(*sy*_{1}, *sx*_{2}), (*sy*_{2}, *sx*_{3}),…, (*sy*_{t−1}, *sx*_{t}),…, (*sy*_{T}, *sx*_{T + 1})}.

Step 4: For *Sxy* and *Syx*, compute the frequency of the *n* = *a*^{2} different events, *Exy*_{i}/*T* and *Eyx*_{i}/*T* for *i* = 1, 2,…, *a*^{2}.

Step 5: Taking into account Eqs. (14) and (15), compute *SNC*(*X → Y*) = *nT*{*Σ*[(*Exy*_{i}/*T*) − (1/*n*)]^{2}} and *SNC*(*Y → X*) = *nT*{*Σ*[(*Eyx*_{i}/*T*) − (1/*n*)]^{2}}.

Step 6: Finally, two null hypotheses must be tested: *X* does not cause *Y*, and *Y* does not cause *X*. In the first case, *SNC*(*X → Y*) should be compared with a Chi-square with *n* − 1 degrees of freedom at the 0.05 significance level; if *SNC*(*X → Y*) is larger than the critical value, the null hypothesis is rejected. The same should be done with *SNC*(*Y → X*).
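A sketch of these steps in Python (our own illustration; the optional AR(1) prefilter of Step 1 is omitted, and the function name `snc_test` is assumed):

```python
import numpy as np
from scipy.stats import chi2

def snc_test(x, y, a=2, alpha=0.05):
    """Symbolic noncausality statistic for H0: X does not cause Y.
    Symbolize both series into `a` equiprobable regions (Step 2),
    form the lagged pairs (sx_{t-1}, sy_t) (Step 3), and compare the
    chi-square statistic of Step 5 with the critical value (Step 6)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sx = np.digitize(x, np.quantile(x, [i / a for i in range(1, a)]))
    sy = np.digitize(y, np.quantile(y, [i / a for i in range(1, a)]))
    events = sx[:-1] * a + sy[1:]          # encode each pair as one index 0..a^2-1
    n, T = a * a, len(events)
    freq = np.bincount(events, minlength=n) / T   # Step 4: event frequencies
    snc = n * T * np.sum((freq - 1.0 / n) ** 2)   # Step 5
    return snc, snc > chi2.ppf(1 - alpha, df=n - 1)

rng = np.random.default_rng(2)
x = rng.normal(size=2000)
y = np.empty(2000)
y[0] = 0.0
y[1:] = 0.9 * x[:-1] + 0.1 * rng.normal(size=1999)   # Y driven by lagged X
print(snc_test(x, y))   # H0 "X does not cause Y": rejected
print(snc_test(y, x))   # reverse direction: typically not rejected
```

Testing the reverse hypothesis simply swaps the roles of the two series, matching the two statistics of Step 5.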

## 7. Symbolic noncausality and Granger noncausality

The introduction of the concept of causality into empirical practice is due to Clive Granger. The classical approach to Granger causality is based on temporal properties. Although the principle was formulated for wide classes of systems, the autoregressive modeling framework proposed by Granger was basically a linear model and, as mentioned in [61], the choice was made for practical reasons. The Granger noncausality test is among the most widely applied tools for testing causality. Three limitations should be noted: (1) the classical test performs well when the process is linear, because it is based on the vector autoregressive (VAR) model; (2) there are extensions of the classical test that consider nonlinear causality, but they are tied to particular nonlinear models; (3) some authors assert that empirical time series are generally contaminated with noise, producing what is known as spurious causality or preventing the detection of causality.

The SNC test presented in [60] is a nonparametric noncausality test based on symbolic time series analysis. The idea is to develop a test complementary to Granger noncausality, showing strengths at the points where the Granger test is weak. In this sense, the proposed SNC test performs well in detecting nonlinear processes, in particular chaotic processes. In addition, the mentioned problem of spurious causality should be alleviated. In fact, according to some experiments, nonlinear models such as the NLAR model, the Lorenz map, and models with exponential terms are not detected by the Granger test, but the SNC test identifies these processes. The test is based on information theory, considering an approximation of the entropy as the measure of uncertainty of a random variable. Information theory is often considered a subset of communication theory; however, in [62] it is argued that it is much more, with fundamental contributions to statistical physics, computer science, statistical inference, and probability and statistics. It is important to highlight the idea relating symbolic analysis, information theory, and the concept of noise. Information theory considers communication between A and B as a physical process in an imperfect environment contaminated by noise. Another important concept is the discrete channel, defined as a system consisting of an input alphabet *X*, an output alphabet *Y*, and a probability transition matrix *p*(*y*|*x*) that expresses the probability of observing the output symbol *y* given that we send the symbol *x*.

To compare the performance between the classical Granger noncausality and the proposed SNC test, the following stochastic and deterministic models were simulated:

- AR(1). We consider two independent series generated by autoregressive (AR) processes: *X*_{t} = 0.2 + 0.45*X*_{t−1} + *ε*_{1t} and *Y*_{t} = 0.8 + 0.5*Y*_{t−1} + *ε*_{2t}, where *ε*_{1t} and *ε*_{2t} are i.i.d. and normally distributed (0,1).
- Nonlinear with exponential component. *X*_{t} = 1.4 − 0.5*X*_{t−1}*e*^{Y_{t−1}} + *ε*_{1t} and *Y*_{t} = 0.4 + 0.23*Y*_{t−1} + *ε*_{2t}, where *ε*_{1t} and *ε*_{2t} are i.i.d. normal(0,1).
- NLAR (nonlinear autoregressive). *X*_{t} = 0.2|*X*_{t−1}|/(2 + |*X*_{t−1}|) + *ε*_{1t} and *Y*_{t} = 0.7|*Y*_{t−1}|/(1 + |*X*_{t−1}|) + *ε*_{2t}, where *ε*_{1t} and *ε*_{2t} are i.i.d. normal(0,1).
- Lorenz. *X*_{t} = 1.96*X*_{t−1} − 0.8*X*_{t−1}*Y*_{t−1}; *Y*_{t} = 0.2*Y*_{t−1} + 0.8*X*^{2}_{t−1}, with initial conditions *X*_{1}, *Y*_{1} generated randomly. This is a discrete version of the Lorenz process, as in [63].
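As an illustration of how such series can be generated, a sketch of the second model (the function name, seed, and zero initial conditions are our own assumptions):

```python
import numpy as np

def simulate_exponential_pair(T, seed=0):
    """Simulate the 'nonlinear with exponential component' model above:
    X_t = 1.4 - 0.5 * X_{t-1} * exp(Y_{t-1}) + e1t,
    Y_t = 0.4 + 0.23 * Y_{t-1} + e2t,
    so that causality runs from Y to X."""
    rng = np.random.default_rng(seed)
    x = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        x[t] = 1.4 - 0.5 * x[t - 1] * np.exp(y[t - 1]) + rng.standard_normal()
        y[t] = 0.4 + 0.23 * y[t - 1] + rng.standard_normal()
    return x, y

x, y = simulate_exponential_pair(1000)
```

Draws like these can then be fed to the symbolic and Granger tests to reproduce power experiments of the kind reported in **Table 2**.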

**Table 2** shows the results of the power experiments applying the SNC and the Granger noncausality test to 10,000 Monte Carlo simulations for the four models and for different sample sizes (*T* = 50, 100, 500, 1000, and 5000).

Following [60], a 60% acceptance or rejection rate of the null hypothesis is considered as a threshold. Both SNC and Granger noncausality correctly identify noncausality in the AR(1) process. **Table 2** suggests that SNC is more conservative in rejecting causality, with percentages below 5%. The nonlinear model with an exponential component implies causality from *Y* to *X*. Note that SNC detects this causality when the sample size is 500 or larger, whereas the Granger test does not detect causality in any case. As asserted in [58], the NLAR process is very difficult to detect; note that SNC is the only test detecting the causality when *T* = 5000. The discrete Lorenz map is chaotic, and it is detected by SNC starting from *T* = 100; however, the Granger test never detects the causality. In particular, it is highlighted that the Granger test is not able to detect the model with an exponential component, the NLAR model, or the chaotic Lorenz map.

Finally, we compare both tests with real US data. In particular, we consider two well-known relationships in economics: the Phillips curve [64], relating the unemployment and inflation rates, and Okun's law [65], relating unemployment and the economic growth rate. We take annual data for the US unemployment rate, inflation rate, and economic growth for the period 1948–2016, a total of 69 observations. **Table 3** shows the results of the Granger noncausality test and the symbolic test considering a partition of two symbols.

| Null hypothesis | Granger | SNC (2 symbols) |
|---|---|---|
| **Phillips curve** | | |
| Unemployment does not cause inflation | 0.04 | 1.53 |
| Inflation does not cause unemployment | 16.90* | 9.41* |
| **Okun's law** | | |
| Unemployment does not cause economic growth | 3.37 | 2.94 |
| Economic growth does not cause unemployment | 61.01* | 9.65* |

The results are similar for both tests. On the one hand, the Granger and symbolic tests both detect causality from inflation to unemployment in the Phillips curve. On the other hand, both tests detect causality running from economic growth to unemployment in Okun's law. Economic theory suggests that inflation increases unemployment while economic growth reduces it. Note that STSA allows causality to be considered in a more general way: whereas Granger noncausality requires continuously measured variables, this is not a restriction for STSA. Consider the following example: we now test the hypothesis of causality from economic growth (G) and inflation (P) to unemployment (U). The main problem is that we have to test causality from a two-dimensional variable to a one-dimensional one. Symbolization transforms the two-dimensional problem into a one-dimensional one, and the symbolic test can then be applied as explained. We follow a similar approach as in [66], where STSA is applied to dynamic regimes. **Figure 2** shows the transformation of the variable (G, P) into a symbolic variable with an alphabet of four symbols (I: low economic growth and low inflation; II: low economic growth and high inflation; III: high economic growth and high inflation; IV: high economic growth and low inflation), taking the mean of each variable as the partition. The application of symbolic causality is now straightforward: the hypothesis that the economic regime (G, P) does not cause unemployment is rejected, since the SNC statistic is 31.76 while the 95% critical value of a chi-square with 15 degrees of freedom (4^{2} − 1) is 25.00. The opposite hypothesis is not rejected, because the SNC statistic is 24.71. It is not possible to test this type of causality with the traditional Granger noncausality test.
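The two-dimensional symbolization just described can be made concrete. A minimal sketch, using each variable's mean as the cut point as in **Figure 2** (the function name `regime_symbols` is ours, not from the chapter):

```python
import numpy as np
from scipy.stats import chi2

def regime_symbols(growth, inflation):
    """Map the pair (G, P) to the four regimes of Figure 2,
    partitioning each variable at its mean."""
    g_hi = growth >= growth.mean()
    p_hi = inflation >= inflation.mean()
    symbols = np.empty(len(growth), dtype="<U3")
    symbols[~g_hi & ~p_hi] = "I"    # low growth, low inflation
    symbols[~g_hi & p_hi] = "II"    # low growth, high inflation
    symbols[g_hi & p_hi] = "III"    # high growth, high inflation
    symbols[g_hi & ~p_hi] = "IV"    # high growth, low inflation
    return symbols

# With an alphabet of 4 symbols, the SNC statistic is compared against a
# chi-square with 4**2 - 1 = 15 degrees of freedom:
critical_95 = chi2.ppf(0.95, 4 ** 2 - 1)   # ~ 25.00, so SNC = 31.76 rejects
```

The resulting one-dimensional symbolic series for (G, P) can then be fed to the symbolic causality test against the unemployment series exactly as in the univariate case.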

## 8. Conclusion

STSA is a powerful tool applied in many scientific fields, with recent applications in robotics, biology, medicine, communications, and engineering. However, applications in social sciences are very recent. The main difficulty is the small amount of historical data produced by social processes. Social scientists are used to applying statistical tests to support their hypotheses, yet there is much work to do in developing statistical tests based on STSA for the social sciences. There are some very recent efforts applying STSA to economics and finance. In particular, we present a symbolic independence test, which appears more powerful in detecting nonlinearities than the well-known BDS and runs tests. The symbolic test is better at detecting models such as the chaotic Anosov and logistic maps or stochastic models such as NLMA or NLSIGN. A second symbolic test, on causality, detects complex processes such as NLAR, the nonlinear exponential model, or the chaotic Lorenz process when traditional Granger noncausality cannot. Symbolic causality also enables causality to be tested from a more general perspective; the test of causality from a two-dimensional economic variable to a one-dimensional one is a clear example of the potential of STSA in economics and the social sciences in general.

One future research line is to develop a powerful nonlinear test for multidimensional variables. As explained, STSA permits a multidimensional time series to be transformed into a one-dimensional time series, simplifying the analysis. This could have important applications in relationships involving vector functions. A more general line of research is to find methodologies to define the optimal partition. As mentioned before, the equiprobable partition is generally applied, but finding the right partition remains a theoretical and practical weakness of STSA.
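As an illustration of why the equiprobable partition is the common default, the sketch below (our example, not from the chapter) compares quantile-based cut points with fixed-width ones on a skewed series:

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.lognormal(size=1000)            # skewed (lognormal) series

# Equiprobable partition: cut points at quantiles, so each symbol occurs
# with (roughly) equal frequency even for heavily skewed data.
cuts = np.quantile(z, [0.25, 0.5, 0.75])
counts = np.bincount(np.searchsorted(cuts, z), minlength=4)
# counts are close to [250, 250, 250, 250]

# A fixed-width (equidistant) partition of the same range instead
# concentrates most observations in the lowest symbol.
width_cuts = np.linspace(z.min(), z.max(), 5)[1:-1]
width_counts = np.bincount(np.searchsorted(width_cuts, z), minlength=4)
```

With a fixed-width partition, some symbols are so rare that the frequency tables behind symbolic tests become unreliable in small samples, which is precisely the setting of the social sciences; hence the preference for equiprobable cuts, even though they need not be optimal.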