Signal Optimal Smoothing by Means of Spectral Analysis

Guido Travaglini

doi:10.5772/66150

Abstract

This chapter introduces two new empirical methods for obtaining optimal smoothing of noise‐ridden stationary and nonstationary, linear and nonlinear signals. Both methods utilize an application of the spectral representation theorem (SRT) for signal decomposition that exploits the dynamic properties of optimal control. The methods, named as SRT1 and SRT2, produce a low‐resolution and a high‐resolution filter, which may be utilized for optimal long‐ and short‐run tracking as well as forecasting devices. Monte Carlo simulation applied to three broad classes of signals enables comparing the dual SRT methods with a similarly optimized version of the well‐known and reputed empirical Hilbert‐Huang transform (HHT). The results point to a more satisfactory performance of the SRT methods and especially the second, in terms of low and high resolution as compared to the HHT for any of the three signal classes, in many cases also for nonlinear and stationary/nonstationary signals. Finally, all of the three methods undergo statistical experimenting on eight select real‐time data sets, which include climatic, seismological, economic and solar time series.

Keywords

signal analysis
white and colored noise
Monte Carlo simulation
time series properties
smoothing techniques

Author Information

Show +

Guido Travaglini*
- University of Rome 1, Rome, Italy

*Address all correspondence to: jay_of_may@yahoo.com

1. Introduction

The literature on time series smoothing and denoising techniques is vast and encompasses different disciplines, such as chemometrics, econometrics, seismology, signal analysis and many more. Among the most renown and utilized methods are the Savitzky‐Golay and Hodrick‐Prescott filters, the Hilbert‐Huang transform (HHT), wavelet analysis, as well as the ample class of kernel filters [1–6].

All of these techniques share an ample spectrum of degrees of resolution loss of the original raw signal, namely, a large family of filters addressed at separating from noise the stochastic or broken trend underlying the observed signal. The researcher is thus enabled to select the desired denoising frequency visually and/or by means of some prior information, like threshold selection, rescaling and data‐dependent smoothing weights. Automatic selection procedures are available in the context of Savitzky‐Golay filtering developed in the field of chemometrics [7] and quite a few in the field of image processing (e.g. [8]). Such procedures are utilized also for the stoppage criteria embedded in the HHT but are virtually absent in the field of econometrics [3, 8–11].

In general, however, the capability of balancing high‐ with low‐resolution characteristics in a world of different real‐time data (RTD) patterns is oftentimes questionable. In fact, in spite of their efficacy, all of these models contribute a common shortcoming by letting the researcher arbitrarily—and thus casually—select the degree of resolution, thereby risking over/under fitting of the slow mode with respect to the original signal. The immediate consequence is suboptimal denoising, namely, a signal extraction yielding a smoother that is statistically inconsistent with the original peak/trough pattern or that too closely replicates the original signal. For this purpose, automated and optimizing parameter selection must be utilized, based on the minimum squared distance between the actual and the smoothed series [12].

The goal of this chapter exactly addresses this problem, which may be defined as the search for statistical dynamic efficiency of the estimated smoother. This search, in practice, ends up with extracting from the available data set the optimal smoother, namely, the stochastic or broken trend, which minimizes the noise variance among many second‐best solutions. Section 2 introduces time series decomposition and the statistical taxonomy of stochastic noise. Sections 3 and 4 are devoted to an introduction and detailed description, respectively, of the HHT and of the SRT models for signal smoothing. Section 5 produces comparative efficiency results of the two new techniques with respect to HHT by using Monte Carlo simulations and reports descriptive statistics of some select climatic, seismological, economic and solar real‐time series. Section 6 concludes, while the Appendix produces the sources of the real‐time signals utilized.

2. Signal decomposition and statistical taxonomy, linearityand stationarity testing

Any observed signal, by means of additive decomposition [3, 13], may be modeled as the sum of three different random variables of unknown distribution such that

Yt=yt*+st+ytE1

where Yt is the time sequence of the real‐valued observations for the discrete‐time notation t∈[1,T], T<∞. Moreover, yt*=E(Yt|Ωt−j) where Ωt−j={Yt−j}jJ is the information set available at t−j, j∈[1,J] for J≤T a maximum preselect lag [14, 15]. The first term of Eq. (1)is the slow‐mode or growth component in the form of a broken trend [16] or of a smoother [3]. The second component st is the periodical seasonal cycle, if any and the last component is the fast‐mode component where E(yt)~IID(0,σy2).

The signal may be linear or nonlinear, as well as stationary or nonstationary, depending on its components and on their distributional properties. Briefly, nonlinearity is a feature of a signal characterized by the presence of linear as well as quadratic and/or cubic variables and/or multiplicative variables, rendering the signal typically nonparametric and thus unsuitable for standard statistical testing such as variance analysis, prediction and stationarity. In addition, a linear or nonlinear signal is stationary (nonstationary) if its best linear trend fit is significantly flat (rising/falling over time). Nonlinearity and nonstationarity of the signal [17] can be tested by means of appropriate procedures [18–21]. The resulting number of regime switches, when applicable, is determined by computing single or multiple intercept and/or trend time breaks [22–24]. For details on the corresponding critical‐value statistics, the reader is redirected to the respective authors.

All components of the observed signal are stochastic and unobservable and must be retrieved from the raw data set by means of appropriate computational methods, some of which were invoked in Introduction. In particular, the slow‐mode term requires the use of some filtering procedures to possibly attain an optimal low‐resolution time‐varying trend, in fact the smoother (Section 3). In addition, if st is a high‐ or low‐resolution phenomenon, it may be entrenched into either or yt* and thus difficult to disentangle from either component.

Any observed signal in Eq. (1) is shown to pertain to one of the following three statistical taxonomic classes: (1) a random Gaussian white noise (RGS), (2) a colored (pink/red) random noise (RWS) and (3) a valid‐threshold regime‐switching Gaussian mixture constructed as a casually ordered sequence of two or more of the previous signals (RGWS).

Formally, the first kind of signal (RGS) may be expressed as follows:

Yt=c+εt+δtE2

where the constant term, which corresponds to the mean of the noise and the time‐slope coefficient, respectively, are c, δ∈(−∞,∞) and {εt}t=1T is a real‐valued univariate white‐noise data sequence such that εt~IID(0,σε2) and σε2<<∞ is a constant. In Eq. (2), if δ=0, the noise is trend‐stationary and the process is a pure RGS with constant mean and variance and expected zero‐autocovariance. Instead, if significantly δ≠0, the noise is trend‐nonstationary and exhibits a rising/falling trend line with white noise superimposed. This result may be substituted into Eq. (1) such that the slow model, and with it the entire signal, would be significantly trended.

The second kind of signal (RWS) is

Yt=c+∑t=1Tεt+δtE3

where the coefficients contribute the same characteristics as those in Eq. (2), but even if δ=0 the noise is trend‐nonstationary and both the variance and the autocovariance are not constant overtime. In consequence, the process, which is an additive white noise, is entirely time dependent in both mean and variance and this affects the overtime pattern of the signal and its components.

The third kind of signal (RGWS) is represented by the Markov regime‐switching model [25–31], where the raw signal may undergo significant structural changes across its lifetime due to variability in its underlying probability transition matrix. Therefore, the signal, which is a combination of Eqs. (2) and (3), may experience some quiet and some turbulent states of nature, like stock‐market fluctuations and seismic signals. The obvious consequence is that the signal may fall short of standard parametric normality or stationarity statistical testing. Moreover, a threshold upper and lower limit must be imposed on regime‐switching dates to avoid spurious computation of clusters too close to each endpoint. The threshold is set in the present context to be 15–85% of the total number of observations with consequential “valid” dates comprised within the reduced‐size sample [24].

The RGWS noise class is represented by the following sequence

Ytim={c1+f1(εt(1),1)+δ1t(1)⋮cm+fm(εt(m),m)+δmt(m)}E4

where m≥2 are the valid state indexes and, for i∈[1,m], the time index t(i) is state‐specific, the expected value E(|δi|)≥0, whereas fi(.) are the nonadditive or additive white‐noise functions, respectively, from Eqs. (2) and (3). From Eq. (4), the typical ith state of the signal is

Yt,i=ci+fi(εt(i),i)+δiti; ∀i∈mE5

where, given an m‐sized vector of exponents represented by a sequence of positive integers 1≤α(i)<<∞, the fast modes of the signal are

{fi(εt(i),i)=(εt(i))α(i), for some 1≤i<mfi(εt(i),i)=(∑t(i)=1T(i)εt(i))α(i), for the remaining 1≤i<mE6

The entire m‐sized sequence of the state functions fi(εt(i),i) may turn to be highly nonlinear if at least one α(i)>1. As a result, the entire signal would give rise in many cases to quirky graphics characterized by enhanced peaks and/or troughs, possibly intermitted by sizably flatter states. Multiple regime switching is expected to be very frequent in such a case, much less so if α(i)≡1, where Eq. (4) would show a more moderate pattern and fewer state gyrations, even zero.

Either of the three taxonomies enables empirical estimation of Eq. (1) by means of any of the smoothing procedures introduced in Section 1. Such procedures in different manners will find the smoother yt* and by default the component yt. In the IID context applied to the noise εt of Eq. (2) to Eq. (4), the random slow and fast components of Eq. (1) are both discrete stochastic processes with zero or nonzero mean and with finite variance and autocovariance. Incidentally, if the slow component is significantly trended (not trended), its autocovariance process is persistent (tapers off slowly over time), whereas the autocovariance of the second component rapidly tapers off over time (is white noise).

Whereas by simple logic the extraction process of the smoother yt* is carried out over the entire timespan of the signal, in the presence of significant regime changes in Eqs. (4) and (5), this procedure might not produce an optimal smoother. Therefore, extraction of the smoother may be applied on a subsample cluster basis rather than by full‐sample computation. Each subsample corresponds to a state (Eq. (4)) and its own smoother is estimated separately from the others. The sequence so obtained is then tested by minimum percentage root mean squared error (PRMSE). In quite, a few real‐time data cases (Section 5), the subsample estimation technique is found to be a better performer.

As a matter of empirical fact, simulation outcomes of Monte Carlo iterations with 1000 draws indicate that around 60% of the RGWS signals with α(i)≡1 in Eq. (5) are nonlinear, while on average, roughly 40% are stationary and almost half of them contain one or more valid regime switches. Nonlinearity and stationarity are very common in the RGWS signals with α(i)>1 in Eq. (5), as they are found in 70% of the corresponding simulations. Finally, on average, less than 1% (10%) of the RGS (RWS) are nonstationary (stationary), whereas in most cases (over 90%), both signals are linear.

3. The Hilbert‐Huang transform (HHT)

The Hilbert‐Huang transform (HHT) is an empirical procedure designed to correct for noise‐ridden signal smoothing, purported to work for both nonlinear and nonstationary time series [4–6, 32, 33]. The HHT is based on the so‐called ensemble empirical mode decomposition (EEMD), which consists, for any arbitrary number of steps 2 ≤ Q ≤ T, of producing at each step an intrinsic mode function (IMF). Each IMF is computed by consecutive “siftings” of the data, a procedure that involves finding the means of the high‐resolution envelopes constructed by cubic splining of the extreme values of the available data, possibly after correcting for endpoints [34]. The Q‐step IMF‐sifting process for a given signal Yt is represented by the following one‐lag adaptive sequence

ht,0=Ytht,1=ht,0−E(ht,0)ht,2=ht,1−E(ht,1)⋯ht,Q=ht,Q−1−E(ht,Q−1)E7

where, given the upper and lower envelopes st,pU, st,pL, p∈[0,Q−1], E(ht,p) is the mean envelope obtained at the end of each sifting process. More generally, from the second line onward, Eq. (7) may be written as follows:

ht,q=ht,q−1−E(ht,q−1), q∈[1,Q]E8

which represents the family of all the sifted IMFs starting from the highest frequency. Subsequently, the matrix HST,Q:(T,Q) of the HHT smoothers is obtained. This is the matrix of the EEMDs. Its rows are expressed as Y˜t,q=ht−q−ht−q−1 from which from which the noise estimates are

ut,q=Y˜t,q−Y˜t,q−1,ut,q~IID(0,σu2).E9

In an HHT environment, the optimal high‐resolution smoother among the candidates of HST,Q may be detected by utilizing the stoppage criterion for sifting, or else by similar means closely akin to the percentage root mean squared error (PRMSE), a much‐used performance index for goodness‐of‐fit purposes. The PRMSEs may be computed as follows:

Sq=(T−1∑t=1Tut,q2∑t=1Ty˜2t,q−1)0.5, ∀q∈QE10

which produce an inverse signal‐to‐noise ratio screeplot of length Q [35, 36]. The procedure for detecting the screeplot global minimizer, denoted as Q*, requires

Pq=‖Sq−2−Sq−1‖‖Sq−1−Sq‖E11

where ‖.‖ is the Euclidean norm of the enclosed argument and Q*=argmax1≤q≤Q(Pq) [37]. There immediately follows that the measured PRMSE, given Q*, is represented by the following formula

S^Q*=(T−1∑t=1Tut,Q*2∑t=1Ty˜t,Q*2)0.5E12

which is the flagship of the efficiency indicators that shall be utilized on confrontational grounds with respect to the two SRTs discussed in Section 4.

4. The spectral representation transform (SRT)

By virtue of the spectral representation theorem, the signal Yt as in Eq. (1) may be approximated by the following De Moivre's formula, as follows:

Yt≃cos(ωt)−r∗sin(ωt)E13

which depicts a harmonic waveform continuous function, where r=−1 and ω are a given arbitrary and constant frequency. There follows that the signal Yt may be defined by the following periodic function

Y^t,k≡µ+∑k=1K[φksin(ωk⋅(t−1))+ϕkcos(ωk⋅(t−1))]E14

where μ is the mean of the signal, k∈[2≤K≤T) for K a maximum lag integer, {φk}k=1K, {ϕk}k=1K are real‐valued random coefficient sequences, both IID with finite mean and variance and {ωk}k=1K is the frequency sequence such that

ωk={T−12πk if ΔYt=f(Yt−1), Limk→0.5T(ωk=π)T−1πk if ΔYt= f(Yt−13), Limk→T(ωk=π) E15

where Δyt=f(.) expresses the existence of linearity and nonlinearity of the process, depending on the exponent attached to the lagged signal level [20, 21]. Needless to say, K generally corresponds to the maximal EMD level contemplated in the HHT method.

The fitted signal of Eq. (14), which is the smoother estimable by ordinary least squares [25], is actually the SRT of the original signal with time‐varying amplitude and lag k. If we let the time series of the prediction error be

et,k=Yt−Y^t,k, ∀k∈KE16

where et,k~IID(0,σe,k2). After producing 1000 Monte Carlo normal simulations of Eq. (14) with T=300, the central limit theorem (CLT), as k→T, is found not to hold asymptotically for RGWS, as is obvious in the presence of regime switches. Elsewise, for RGS and RWS, we have limk→T(Yt=Y^t,k).

Similar to the technique utilized for identifying the optimal smoother in the PRMSE sense exhibited in Section 3, also here a matrix of smoother candidates may be obtained depending on the kth lag chosen. The matrix is defined as SST,K:(T,K) from which the optimal smoother can be selected by applying the optimal lag stopping criterion similar in kind to Eq. (16). However, we utilize here a performance index for model selection different from the PRMSE, which is based on the dynamics of the Hamiltonian optimal control problem [38–40].

If we let et,k be defined as in Eq. (16), then its dynamics is captured by the first‐order autoregressive AR(1) process gt,k=et,k−et−1,k while the dynamics of the smoother is expressed as the AR(1) process ht.k=Y^t,k−Y^t−1,k. We expect both processes to be normally distributed with zero mean and finite variance.

Hence, the Hamiltonian problem may be expressed as follows:

Hk=12∑t=2T{et,k2+gt,k2+ht,k2}, ∀k∈KE17

where the first element within the curly braces is of obvious reading and represents a resolution index that picks up the high frequencies of the problem. The second and the third elements capture the AR(1) dynamics of the processes involved in the Hamiltonian problem, namely, those of the prediction error and those of the signal itself. Eq. (17) is a cubic spline smoother that partly resembles the Hodrick‐Prescott filter in the inclusion of both a cyclical and a trend part [3]. Moreover, Eq. (17) forms the basis for a screeplot [36] similar in kind to that of Eq. (11), whereby the optimal lag K* is obtained after letting

Vk=‖Hk−2−Hk−1‖‖Hk−1−Hk‖E18

such that K*=argmax1≤k≤K(Vk) and wherefrom Y^t,K** is the optimal low‐resolution Hamiltonian‐based smoother, which is named SRT1.

From this smoother the dual high‐resolution SRT smoother, named SRT2, may be easily constructed by means of envelope augmentation, a procedure knowingly embedded into the HHT sifting process (Section 3). Let the upper and the lower envelopes of the actual signal be defined as the unique cubic splines stU, stL constructed by using the extrema of the given data. The upshot is the signal Y^t,K***=12(stU+ stL)∈[stU, stL], such that the second smoother is Y˜t,K**=12(Y^t,K**+Y^t,K***), which is the mean of the two smoothers. The error time series, for the optimal given lag K*, are expressed in a fashion similar to Eq. (9), as follows

{εt,K*=Yt−Y^t,K**,εt,K*:IID(0,σε2)ηt,K*=Yt−Y˜t,K**,ηt,K*:IID(0,ση2)E19

where the first error is associated with SRT1 and the second is associated with SRT2.

From Eq. (19), the PRMSEs of both models may be found by the same means as those employed to obtain Eq. (12), namely

W^K*=(T−1∑t=1Tεt,K*2∑t=1TY^t,K*2)0.5, W˜K*=(T−1∑t=1Tηt,K*2∑t=1TY˜t,K*2)0.5E20

where the first (second) index is associated with SRT1 (SRT2).

5. Smoother analysis applied to artificial and real‐time signals

Figure 1 depicts six signals pertaining to the three taxonomic classes (Section 2). The first three signals are 300 observation artificial RGS, RWS and RGWS, drawn from standard normal distributions. The other three are real‐world signals, of which the first two are recent Japanese earthquake seismographic waveforms collected from a specific web directory [41]. The last signal represents the Qualcomm NASDAQ‐traded stock close price on a weekly basis from the year 1995 to date. In particular, with regard to the first two real‐time signals (Figure 1, panels 4 and 5), the reported seismographs concern two earthquakes, respectively, occurred in Makurazaki (Kagoshima prefecture) on November 13, 2015 at 20:51 UTC, with moment magnitude (Mw) of 6.7 and no damages no victims and in the Tohoku region on March 07, 2011 at 5:46 UTC, with Mw of 9.0, an ensuing tsunami and a reported death toll of 15,893. The waveform data from respective seismogram stations include 108 and 47,200 observations of which only the last 3000 were retained for experimentation.

Figure 1.
Artificial and real‐time random signals, all smoothers and linear trends. Vertical axes represent signal magnitudes, horizontal axes represent time in calendar years (1, 2, 3 and 6) or in seconds (4, 5).

Horizontal (vertical) measurements in Figure 1 represent time lengths (magnitudes). All of the three smoothers (HHT, SRT1 and SRT2) are included together with a linear best‐fit trend. The first two signals (RGS and RWS) are linear while the third (RGWS) is nonlinear (Figure 1, panels 1–3). Among them, only RGS is stationary and none exhibits a valid time break. Of the three real‐time signals (Makurazaki and Tohoku earthquakes, Qualcomm), all are nonlinear and one is nonstationary as well (Tohoku earthquake). Moreover, all of these signals exhibit one single valid time break each, respectively, located at observation 57, 1123 and 261. Finally, from unreported results, SRT2 prevails as the least‐PRMSE smoother for the first four signals, while the quirky behavior of the last two signals (Tohoku and Qualcomm) finds HHT as the most efficient smoother.

Table 1, columns A–C, exhibits the efficiency performance and other coefficients of the three smoothers (HHT, SRT1 and SRT2) for the three signal classes (RGS, RWS and RGWS). The results shown are the mean values obtained from empirical Monte Carlo signal simulations with 1000 draws each and length of 300 observations. The PRMSE coefficients reported in line 1 have received attention in Eqs. (12) and (20), whereas the coefficients reported in line 2 show the simulated mean error variance appearing in their numerators. The variance of the smoothed trends is exhibited in line 3. In all cases, simple eyeballing points to SRT2 as the most efficient, that is, the variance minimizing smoothing method (Table 1, third column of columns A–C).

Tables	A. RGS Monte Carlo simulations, 300 observations each			B. RWS Monte Carlo simulations, 300 observations each			C. RGWS Monte Carlo simulations, 300 observations each
Signal/coefficient	HHT	SRT1	SRT2	HHT	SRT1	SRT2	HHT	SRT1	SRT2
1. PRMSE	0.924	0.973	0.875	0.224	0.228	0.164	0.288	0.355	0.221
2. Error variance	0.854	0.952	0.771	1.633	1.844	0.911	4.459	5.533	1.973
3. Smoother variance	0.924	0.978	0.881	7.290	8.088	5.554	15.482	15.586	8.928
4. Corrected ADF test statistic for stationarity of error variance	-3.138	1.472	-6.851	-4.088	-3.243	-4.067	-1.831	-1.565	-3.104

Table 1.

Monte Carlo simulation mean values of select coefficients for the three signal classes.

Stationarity of the error variances in Eqs. (9) and (19) is crucial for establishing the goodness‐of‐fit of the estimated smoothers of all signal classes. Here, stationarity is tested by means of a novel technique, which corrects the conventional augmented Dickey‐Fuller test statistic [19] after accounting for overtime changes in subsample variances, cycles and growing/falling linear trends in the signal [12]. This technique exhibits a similar nonparametric distribution as the ADF test statistic, such that the critical test statistics for stationarity of the untransformed signal are -2.79 and -2.51 for p‐values of 1 and 5%, respectively. The corrected ADF test statistics of the error variances are reported in Table 1, line 4. In many, if not in most cases, stationarity emerges, yet the error variances generated by SRT2 are significantly stationary for all signal classes, whereas the other methods fail in the particular case of RGWS.

In order to further compare the HHT and the SRT performances, select real‐time signals are being put to empirical testing for the sake of optimal smoothing analysis. The entire data set of the real‐time data (RTD) includes eight signals of different nature: climatic, economic, seismic and solar. The RTD proposed are mostly high resolution and long term, ranging from a minimum of 315 to a maximum of 1915 observations of which one is yearly and all the others are monthly. The last access to the web for all available data was November 30, 2015.

The full source list and related time intervals are contained in the RTD Appendix, while the synthetic acronymed list of the select signals is the following: AMO (Atlantic Multidecadal Oscillation), GISS (Global Land‐Ocean Temperature Index), Yamalia (Yamal Peninsula Temperature Reconstructions), S&P500 (Standard & Poor's 500 Composite Index), SPI (U.S. Standardized Precipitation Index), Banda (Banda Aceh earthquake, 2004), NASDAQ ( Close values of the NASDAQ Stock Index) and finally SSN (Sunspot Numbers).

Among these RTDs, worth of more details than those provided in the Appendix is the earthquake of Banda Aceh, Indonesia, which occurred on December 26, 2014 at 00:58 UTC with Mw of 9.2 and which caused, especially due to an extraordinarily violent tsunami wave, a death toll of an estimated 250,000. The data utilized straddle the foreshock and the main shock tremors, namely, the observations comprised between 12,000 and 15,000 out of a total of recorded waves tallying 58,320 observations.

The relevant descriptive statistics and visual performances of the RTDs are exhibited, respectively, in Table 2 and in Figure 2a and b. From Table 2, a very diverse pattern emerges in terms of mean and standard deviation with estimated volatilities ranging from 1.0 (NASDAQ) to 101.0 (Banda). Moreover, skewness appears mild everywhere, barring two cases where it is relatively high (S&P500 and NASDAQ), whereas kurtosis hovers for all RTDs around its critical value of three. All but two of the RTDs (AMO and SSN) are nonstationary and all exhibit zero valid regime switches, exclusion made for Banda and NASDAQ. Finally, they are all nonlinear except for AMO and SPI, as subsumed from the Harvey linearity test statistic whose critical value is close to 3.0. Optimal smoothing of the RTDs is achieved in the last two cases (Banda and NASDAQ) by means of subsample cluster analysis whereas for the other RTDs full‐sample computation is more efficient (Section 2). The SRT2 smoother is found to exhibit the smallest PRMSEs in 75% of the cases proposed, whereas only three of them prefer the HHT method (S&P500, NASDAQ and SSN).

Signal/Statistic	Number of observations	Mean	Standard deviation	Skewness	Kurtosis	Harvey linearity test statistic	Corrected stationarity test statistic	Number of valid regime switches
AMO	1915	-0.030	0.241	0.046	3.499	2.994	-4.107	0
GISS	1627	1.508	32.994	0.608	2.698	1627.0	27.299	0
Yamalia	356	-0.714	1.790	-0.421	3.230	355.90	5.408	0
S&P500	789	478.650	550.361	1.121	2.977	789.000	83.865	0
SPI	1407	-0.233	1.039	-0.334	2.995	1.751	1.825	0
Banda	3000	584.863	59 × 10³	0.279	2.896	3000.0	21.821	2
NASDAQ	536	1267.57	1263.04	1.048	3.299	536.0	38.445	3
SSN	315	54.365	40.457	0.663	2.440	315.000	-4.748	0

Table 2.

Descriptive and select test statistics of the real‐time dataset (RTD).

Figure 2.
Real‐time select random signals (a) vertical axes represent signal magnitudes, horizontal axes represent time in calendar years) and (b) vertical axes represent signal magnitudes, horizontal axes represent time in years (5, 7 and 8) or seconds (6).

For each RTD, the actual signal, its optimal smoother (HHT or SRT2) and its linear trend are exhibited in Figure 2a and b. In Figure 1, horizontal (vertical) measurements represent time lengths (magnitudes). Among the three HHT optimal smoothers found above, two are related to quirky signals, S&P500 and NASDAQ (Figure 2a, panel 4 and Figure 2b, panel 7). They exhibit however different frequencies, in spite of being estimated by the same subsample cluster technique. In fact the former (latter) is a low‐ (high‐) resolution signal. Both signals exhibit in any case broken trends characterized by a long period of quiet followed by wild gyrations, which reflect the highly varying moods of both the stock market and of the Federal Reserve Board. The third signal associated with optimal smoothing attained through the HHT is the time series of sunspots (SSN). Full‐sample estimation of the smoother was preferred, while broken trends clearly emerge from visual inspection (Figure 2b, panel 8). Significant regime switches are found to occur at the observations 1810 and 1902, in occasion of the Dalton and of the Modern Minimum, respectively [12].

Among the other five RTDs whose optimal smoother is of the SRT2 brand, only one (Banda) requires subsample cluster computation. The optimal smoother exhibits a dramatic regime switch in the passage from foreshocks to the main shocks and somewhat later (Figure 2b, panel 6). The two major break dates were found to be placed at observations 1833 and 1905 of 3000 observations. All of the RTD optimal smoothers, including Banda, exhibit high resolution and, at least visually, manifest considerable accuracy in tracking the original signal. This is particularly true for AMO, GISS, Yamalia and SPI, which are sizably noise‐ridden (Figure 2a, panels 1–3 and 5).

6. Conclusions

This chapter has introduced and described in detail two new dual empirical methods for obtaining optimal smoothing of random signals pertaining to three broad taxonomic classes. Both methods utilize an application of the spectral representation theorem for signal decomposition that exploits the dynamic properties of optimal control. The two methods, named SRT1 and SRT2, produce a low‐ and a high‐resolution filter, which may be utilized for optimal long‐ and short‐run tracking as well as forecasting devices. The methods proven by Monte Carlo simulation found to be more efficient than the empirical Hilbert‐Huang transform (HHT) for all of the taxonomic classes. The methods are also comparatively tested by using random artificial and a bunch of real‐time signals, particularly eight select data sets including climatic, seismological and economic time series. HHT is proven to be more efficient in few cases of quirky, multiple regime‐switch signals, like the Standard & Poor's 500 and the NASDAQ indexes.

References

1. Savitzky A. and Golay M.J.E. Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry. 1964; 36: 1627–1639. DOI: 10.1021/ac60214a047
2. Daubechies I. Ten Lectures on Wavelets. SIAM, Society for Industrial and Applied Mathematics, Philadelphia, PA. 1992. p. xix + 357. DOI: 10.1137/1.9781611970104; 10.1137/1.9781611970104#_blank#Opens new window
3. Hodrick R. and Prescott E.C. Postwar U.S. business cycles: an empirical investigation. Journal of Money, Credit and Banking. 1997; 29: 1–16.
4. Huang N.E., Shen Z., Long S.R., Wu M.C., Shih H.H., Zheng Q., Yen N.‐C., Tung C.C. and Liu H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis. Proceedings of the Royal Society in London. 1998; 454A: 903–993. DOI: 10.1098/rspa.1998.0193
5. Huang N.E., Wu M.-L., Qu W., Long S.R., Shen S.S.P., and Zhang J.E. Applications of Hilbert–Huang transform to non-stationary ﬁnancial time series analysis. Applied Stochastic Models in Business and Industry. 2003; 19: 245–268. DOI: 10.1002/asmb.506
6. Huang N.E. and Wu Z. A review on Hilbert‐Huang transform: method and its applications to geophysical studies. Review of Geophysics. 2008; 46: RG2006. DOI: 10.1029/2007RG000228
7. Vivó‐Truyols G. and Schoenmakers P.J. Automatic selection of optimal Savitzky‐Golay smoothing. Analytical Chemistry. 2006; 78: 4598–4608. DOI: 10.1021/ac0600196
8. Xiang Z. and Peyman M. Automatic parameter selection for denoising algorithms using a no‐reference measure of image content. IEEE Transactions on Image Processing. 2010; 19: 3116–3132. DOI: 10.1109/TIP.2010.2052820
9. Baxter M. and King R.G. Measuring business cycles: approximate band‐pass filters for economic time series. Review of Economics and Statistics. 1999; 81: 575–593. DOI: 10.3386/w5022
10. Ravn M.O. and Uhlig H. On adjusting the Hodrick‐Prescott filter for the frequency of observations. Review of Economics and Statistics. 2002; 84: 371–376.
11. McElroy T. Exact formulas for the Hodrick Prescott filter. The Econometrics Journal. 2008; 11: 209–217. DOI: 10.1111/j.1368‐423X.2008.00230.x
12. Travaglini G. An econometric investigation of the sunspot number record since the year 1700 and its prediction into the 22nd century. Advances in Space Research. 2015; 56: 992–1002. DOI: 10.1016/ j.asr. 2015. 05.012.
13. Cramér H. On some classes of nonstationary stochastic processes. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley and Los Angeles, University of Los Angeles Press. 1961, pp. 57–77.
14. Ng S. and Perron P. Lag length selection and the construction of unit root tests with good size and power. Econometrica. 2001; 69: 1519–1554. DOI: 10.1111/1468‐0262.00256
15. Su J.‐J., Cheung A.W.‐K. and Roca E. Lag selection of the augmented Kapetanios‐Shin‐Snell nonlinear unit root test. Journal of Mathematics and Statistics. 2013; 9: 102–111. DOI: 10.3844/jmssp.2013
16. Rappoport P. and Reichlin L. On broken trends, random walks and non‐stationary cycles. Technological and Social Factors in Long‐Term Fluctuations. Di Marco M., et al. editors, Siena, IT. Springer-Verlag. 1989, pp. 305–331. DOI: 10.1007/978‐3‐642‐48360‐816
17. Nason G.P. Stationary and nonstationary time series. Ch. 11, Statistics in Volcanology.Mader H. and Coles S.C. editors. The Geological Society, London. 2006, pp. 129–142.
18. Harvey D.I., Leybourne S.J. and Xiao B. A powerful test for linearity when the order of integration is unknown. Studies in Nonlinear Dynamics & Econometrics. 2008; 12: Article 2. DOI: 10.2202/1558‐3708.1582
19. Said E. and Dickey D.A. Testing for unit roots in autoregressive moving average models of unknown order. Biometrika. 1984; 71: 599–607. DOI: 10.1093/biomet/71.3.599
20. Kapetanios G., Shin Y. and Snell A. Testing for a unit root in the nonlinear STAR framework. Journal of Econometrics. 2003; 112: 359–379. DOI: 10.1016/S0304‐4076(02)00202‐6
21. Kapeitanos G. and Shin Y. Testing the null hypothesis of nonstationary long memory against the alternative hypothesis of a nonlinear ergodic model. Econometrics Review. 2011; 30: 620–645. DOI: 10.1080/07474938.2011.553568 DOI:10.1080/07474938.2011.553568#_self
22. Perron P. The great crash, the oil price shock and the unit root hypothesis. Econometrica. 1989; 57: 1361–1401. DOI: 10.2307/1913712
23. Bai J. and Perron P. Computation and analysis of multiple structural change models. Journal of Applied Econometrics. 2003; 18: 1–22. DOI: 10.1002/jae.659
24. Kim D. and Perron P. Unit root tests allowing for a break in the trend function at an unknown time under both the null and alternative hypotheses. Journal of Econometrics. 2009; 148: 1–13. DOI: 10.1016/j.jeconom.2008.08.019
25. Hamilton J.D. A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica. 1989; 57: 357–384. DOI: 10.2307/1912559
26. Hamilton J.D. Time Series Analysis. Princeton University Press, Princeton, NJ. 1994, pp. xiv + 799. DOI: 10.1017/S0266466600009440; 10.1017/S0266466600009440#_blank
27. Hamilton J.D. Regime switching models. The New Palgrave Dictionary of Economics, 2nd ed., Durlauf S.N. and Blume L.E. editors. Palgrave Macmillan, London. 2008. DOI: 10.1057/9780230226203.1411
28. Kanas A. Purchasing power parity and Markov regime switching. Journal of Money Credit and Banking. 2006; 38: 1669–1687. DOI: 10.1353/mcb.2006.0083
29. Mizrach B. Nonlinear time series analysis. The New Palgrave Dictionary of Economics, 2nd ed., Durlauf S.N. and Blume L.E. editors. Palgrave Macmillan, London, UK. 2008, pp. 169–177.
30. Mizrach B. Nonlinear mean reversion in EMS exchange rates. Brussels Economic Review. 2010; 53: 187–198.
31. Lee H.‐T. and Yoon G. Does purchasing power parity hold sometimes? Regime switching in real exchange rates. Applied Economics. 2013; 45: 2279–2294. DOI: 10.1080/00036846.2012661399
32. Wu Z., Huang N.E., Long S.R. and Peng C.‐K. On the trend, detrending and variability of nonlinear and nonstationary time series. PNAS. 2007; 104: 14889–14894. DOI: 10.1073/ pnas.0701020104
33. Wu Z. and Huang N.E. Ensemble empirical mode decomposition: a noise‐assisted data analysis method. Advances in Adaptive Data Analysis. 2009; 1: 1–41. DOI: 10.1142/S1793536909000047
34. Dätig M. and Schlurmann T. Performance and limitations of the Hilbert‐Huang transformation (HHT) with an application to irregular water waves. Ocean Engineering. 2004; 31: 1783–1834.
35. Jolliffe I.T. A note on the use of principal components in regression. Applied Statistics. 1982; 31: 300–303.
36. Jolliffe I.T. Principal Component Analysis, 2nd ed., Springer‐Verlag, New York. 2002, pp. xxix + 427.
37. Ahn S.C. and Horenstein A.R. Eigenvalue ratio test for the number of factors. Econometrica. 2013; 81: 1203–1227. DOI: 10.3982/ECTA8968
38. Miller R.E. Dynamic Optimization and Economic Applications. McGraw‐Hill, New York, NY. 1979, pp. vii + 332.
39. Athans M. and Falb P. Optimal Control: An Introduction to the Theory and Its Applications. Dover Publications, New York. 2007, pp. xvi + 879.
40. Ross I.M. A Primer on Pontryagin's Principle in Optimal Control. Collegiate Publishers, San Francisco, CA. 2009, p. 102.
41. IRIS, Incorporated Research Institutions for Seismology, 2015. Available from: https:/ds.iris.edu.ds.nodes.dmc [Accessed: 2016‐09‐11].

[1] 1. Savitzky A. and Golay M.J.E. Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry. 1964; 36: 1627–1639. DOI: 10.1021/ac60214a047

[2] 2. Daubechies I. Ten Lectures on Wavelets. SIAM, Society for Industrial and Applied Mathematics, Philadelphia, PA. 1992. p. xix + 357. DOI: 10.1137/1.9781611970104; 10.1137/1.9781611970104#_blank#Opens new window

[3] 3. Hodrick R. and Prescott E.C. Postwar U.S. business cycles: an empirical investigation. Journal of Money, Credit and Banking. 1997; 29: 1–16.

[4] 4. Huang N.E., Shen Z., Long S.R., Wu M.C., Shih H.H., Zheng Q., Yen N.‐C., Tung C.C. and Liu H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis. Proceedings of the Royal Society in London. 1998; 454A: 903–993. DOI: 10.1098/rspa.1998.0193

[5] 5. Huang N.E., Wu M.-L., Qu W., Long S.R., Shen S.S.P., and Zhang J.E. Applications of Hilbert–Huang transform to non-stationary ﬁnancial time series analysis. Applied Stochastic Models in Business and Industry. 2003; 19: 245–268. DOI: 10.1002/asmb.506

[6] 6. Huang N.E. and Wu Z. A review on Hilbert‐Huang transform: method and its applications to geophysical studies. Review of Geophysics. 2008; 46: RG2006. DOI: 10.1029/2007RG000228

[7] 7. Vivó‐Truyols G. and Schoenmakers P.J. Automatic selection of optimal Savitzky‐Golay smoothing. Analytical Chemistry. 2006; 78: 4598–4608. DOI: 10.1021/ac0600196

[8] 8. Xiang Z. and Peyman M. Automatic parameter selection for denoising algorithms using a no‐reference measure of image content. IEEE Transactions on Image Processing. 2010; 19: 3116–3132. DOI: 10.1109/TIP.2010.2052820

[9] 9. Baxter M. and King R.G. Measuring business cycles: approximate band‐pass filters for economic time series. Review of Economics and Statistics. 1999; 81: 575–593. DOI: 10.3386/w5022

[10] 10. Ravn M.O. and Uhlig H. On adjusting the Hodrick‐Prescott filter for the frequency of observations. Review of Economics and Statistics. 2002; 84: 371–376.

[11] 11. McElroy T. Exact formulas for the Hodrick Prescott filter. The Econometrics Journal. 2008; 11: 209–217. DOI: 10.1111/j.1368‐423X.2008.00230.x

[12] 12. Travaglini G. An econometric investigation of the sunspot number record since the year 1700 and its prediction into the 22nd century. Advances in Space Research. 2015; 56: 992–1002. DOI: 10.1016/ j.asr. 2015. 05.012.

[13] 13. Cramér H. On some classes of nonstationary stochastic processes. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley and Los Angeles, University of Los Angeles Press. 1961, pp. 57–77.

[14] 14. Ng S. and Perron P. Lag length selection and the construction of unit root tests with good size and power. Econometrica. 2001; 69: 1519–1554. DOI: 10.1111/1468‐0262.00256

[15] 15. Su J.‐J., Cheung A.W.‐K. and Roca E. Lag selection of the augmented Kapetanios‐Shin‐Snell nonlinear unit root test. Journal of Mathematics and Statistics. 2013; 9: 102–111. DOI: 10.3844/jmssp.2013

[16] 16. Rappoport P. and Reichlin L. On broken trends, random walks and non‐stationary cycles. Technological and Social Factors in Long‐Term Fluctuations. Di Marco M., et al. editors, Siena, IT. Springer-Verlag. 1989, pp. 305–331. DOI: 10.1007/978‐3‐642‐48360‐816

[17] 17. Nason G.P. Stationary and nonstationary time series. Ch. 11, Statistics in Volcanology.Mader H. and Coles S.C. editors. The Geological Society, London. 2006, pp. 129–142.

[18] 18. Harvey D.I., Leybourne S.J. and Xiao B. A powerful test for linearity when the order of integration is unknown. Studies in Nonlinear Dynamics & Econometrics. 2008; 12: Article 2. DOI: 10.2202/1558‐3708.1582

[19] 19. Said E. and Dickey D.A. Testing for unit roots in autoregressive moving average models of unknown order. Biometrika. 1984; 71: 599–607. DOI: 10.1093/biomet/71.3.599

[20] 20. Kapetanios G., Shin Y. and Snell A. Testing for a unit root in the nonlinear STAR framework. Journal of Econometrics. 2003; 112: 359–379. DOI: 10.1016/S0304‐4076(02)00202‐6

[21] 21. Kapeitanos G. and Shin Y. Testing the null hypothesis of nonstationary long memory against the alternative hypothesis of a nonlinear ergodic model. Econometrics Review. 2011; 30: 620–645. DOI: 10.1080/07474938.2011.553568 DOI:10.1080/07474938.2011.553568#_self

[22] 22. Perron P. The great crash, the oil price shock and the unit root hypothesis. Econometrica. 1989; 57: 1361–1401. DOI: 10.2307/1913712

[23] 23. Bai J. and Perron P. Computation and analysis of multiple structural change models. Journal of Applied Econometrics. 2003; 18: 1–22. DOI: 10.1002/jae.659

[24] 24. Kim D. and Perron P. Unit root tests allowing for a break in the trend function at an unknown time under both the null and alternative hypotheses. Journal of Econometrics. 2009; 148: 1–13. DOI: 10.1016/j.jeconom.2008.08.019

[25] 25. Hamilton J.D. A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica. 1989; 57: 357–384. DOI: 10.2307/1912559

[26] 26. Hamilton J.D. Time Series Analysis. Princeton University Press, Princeton, NJ. 1994, pp. xiv + 799. DOI: 10.1017/S0266466600009440; 10.1017/S0266466600009440#_blank

[27] 27. Hamilton J.D. Regime switching models. The New Palgrave Dictionary of Economics, 2nd ed., Durlauf S.N. and Blume L.E. editors. Palgrave Macmillan, London. 2008. DOI: 10.1057/9780230226203.1411

[28] 28. Kanas A. Purchasing power parity and Markov regime switching. Journal of Money Credit and Banking. 2006; 38: 1669–1687. DOI: 10.1353/mcb.2006.0083

[29] 29. Mizrach B. Nonlinear time series analysis. The New Palgrave Dictionary of Economics, 2nd ed., Durlauf S.N. and Blume L.E. editors. Palgrave Macmillan, London, UK. 2008, pp. 169–177.

[30] 30. Mizrach B. Nonlinear mean reversion in EMS exchange rates. Brussels Economic Review. 2010; 53: 187–198.

[31] 31. Lee H.‐T. and Yoon G. Does purchasing power parity hold sometimes? Regime switching in real exchange rates. Applied Economics. 2013; 45: 2279–2294. DOI: 10.1080/00036846.2012661399

[32] 32. Wu Z., Huang N.E., Long S.R. and Peng C.‐K. On the trend, detrending and variability of nonlinear and nonstationary time series. PNAS. 2007; 104: 14889–14894. DOI: 10.1073/ pnas.0701020104

[33] 33. Wu Z. and Huang N.E. Ensemble empirical mode decomposition: a noise‐assisted data analysis method. Advances in Adaptive Data Analysis. 2009; 1: 1–41. DOI: 10.1142/S1793536909000047

[34] 34. Dätig M. and Schlurmann T. Performance and limitations of the Hilbert‐Huang transformation (HHT) with an application to irregular water waves. Ocean Engineering. 2004; 31: 1783–1834.

[35] 35. Jolliffe I.T. A note on the use of principal components in regression. Applied Statistics. 1982; 31: 300–303.

[36] 36. Jolliffe I.T. Principal Component Analysis, 2nd ed., Springer‐Verlag, New York. 2002, pp. xxix + 427.

[37] 37. Ahn S.C. and Horenstein A.R. Eigenvalue ratio test for the number of factors. Econometrica. 2013; 81: 1203–1227. DOI: 10.3982/ECTA8968

[38] 38. Miller R.E. Dynamic Optimization and Economic Applications. McGraw‐Hill, New York, NY. 1979, pp. vii + 332.

[39] 39. Athans M. and Falb P. Optimal Control: An Introduction to the Theory and Its Applications. Dover Publications, New York. 2007, pp. xvi + 879.

[40] 40. Ross I.M. A Primer on Pontryagin's Principle in Optimal Control. Collegiate Publishers, San Francisco, CA. 2009, p. 102.

[41] 41. IRIS, Incorporated Research Institutions for Seismology, 2015. Available from: https:/ds.iris.edu.ds.nodes.dmc [Accessed: 2016‐09‐11].

Signal Optimal Smoothing by Means of Spectral Analysis

Advances in Statistical Methodologies and Their Application to Real Problems

Abstract

Keywords

Author Information

Guido Travaglini*

1. Introduction

2. Signal decomposition and statistical taxonomy, linearityand stationarity testing

3. The Hilbert‐Huang transform (HHT)

4. The spectral representation transform (SRT)

5. Smoother analysis applied to artificial and real‐time signals

Figure 1.

Table 1.

Table 2.

Figure 2.

6. Conclusions

References

Information‐Theoretic Clustering and Algorithms

Signal Optimal Smoothing by Means of Spectral Analysis

Advances in Statistical Methodologies and Their Application to Real Problems

Abstract

Keywords

Author Information

Guido Travaglini*

1. Introduction

2. Signal decomposition and statistical taxonomy, linearityand stationarity testing

3. The Hilbert‐Huang transform (HHT)

4. The spectral representation transform (SRT)

5. Smoother analysis applied to artificial and real‐time signals

Figure 1.

Table 1.

Table 2.

Figure 2.

6. Conclusions

References

Continue reading from the same book

Advances in Statistical Methodologies and Their Application to Real Problems