Nonlinear Analysis of Surface EMG Signals

In Section 2, we discuss the two methods of nonlinear time series test: surrogate data test method and Volterra-Wiener-Korenberg (VWK) model test method. Theoretically, the two methods can detect the nonlinearity of the data indirectly. The surrogate data method is used to analyze the SEMG. The result shows that the SEMG has deterministic nonlinear components. Meanwhile, we introduce the VWK model test method and compare it with the surrogate data method. The nonlinearity of SEMG during muscle fatigue can be detected by the VWK.


Introduction
The aim of this chapter is to answer the essence of SEMG and to explore the potential use of nonlinear analysis as a tool in the clinical and biomechanical applications. The technical tools include nonlinear time series test, time series analysis based on chaos theory, multifractal analysis.
In Section 2, we discuss the two methods of nonlinear time series test: surrogate data test method and Volterra-Wiener-Korenberg (VWK) model test method. Theoretically, the two methods can detect the nonlinearity of the data indirectly. The surrogate data method is used to analyze the SEMG. The result shows that the SEMG has deterministic nonlinear components. Meanwhile, we introduce the VWK model test method and compare it with the surrogate data method. The nonlinearity of SEMG during muscle fatigue can be detected by the VWK.
In Section 3, we describe the time series analysis based on chaos theory. The chaos definition and chaotic characteristics are discussed. The embedding theory of the attractor reconstruction is investigated for the dynamical system. From the view of the fractal structure of the chaotic attractor, the correlation dimension is used to test the chaotic characteristics of the SEMG during arm movements. The Largest Lyapunov exponent is also studied. Then, we investigate the influence of measure noise, internal noise and sampling interval on the principal components of chaotic time series. The symplectic principal component analysis is given. We illustrate the feasibility of this method and give the embedding dimension of the action surface EMG signal. In Section 4, the self-affine fractal definition and nature are described. The power spectrum and frequency relationship is used to calculate the self-affine fractal dimension of the time series, such as SEMG. Then, the multifractal dimension is given for the SEMG.
The conclusion and future research are shown in Section 5. Here, it is necessary to note that this chapter is actually the result of many years work. The new methods presented here build on a broad and strong foundation of nonlinear time series analysis and chaotic dynamical theory.

Detecting nonlinearity of the surface EMG signals
In many areas of science and engineering, it is a critical issue to determine whether an observed time series is purely stochastic, or deterministic nonlinear, even chaotic. One may know about the intrinsic properties of the observed phenomenon by distinguishing between nonlinear deterministic dynamics and noisy dynamics from a time series. In this section, we review and discuss the surrogate data test method [1] and Volterra-Wiener-Korenberg (VWK) model test method [2] for identifying the nonlinearity of a time series. These methods have been successfully used to detect and characterize nonlinear dynamics from recordings in biology and medicine [2][3][4][5].
Surrogate analysis is currently an important empirical technique of testing nonlinearity for a time series. The aim is to test whether the dynamics are consistent with linearly filtered noise or a nonlinear dynamical system [1,6]. The basic idea of the surrogate data method is to first specify some kind of linear stochastic process as a null hypothesis that mimics "linear properties" of the original data. According to the null hypothesis, surrogate data sets are generated. Then, a discriminating statistic is calculated for the original and for each of the surrogate data sets. If the statistic of the original data is significantly different from those of surrogate data sets, the null hypothesis can be rejected within a certain confidence level. It shows that the original data is from a nonlinear dynamical system. The method is demonstrated for numerical time series generated by known chaotic systems and applied to the analysis of SEMG.
VWK test method is a kind of nonlinear detection of time series based on linear and nonlinear Volterra-Wiener-Korenberg model [2,5]. That is, it first produces the linear and nonlinear predicted data from the original time series and then compares their information criterions to detect the nonlinearity of the raw data. VWK test technique is capable of robust and highly sensitive statistical detection of deterministic dynamics, including chaotic dynamics, in experimental time series. This method is superior to other techniques when applied to short time series, either continuous or discrete, even when heavily contaminated with noise or in the presence of strong periodicity. Later, an extension of the Volterra algorithm (called the numerical titration algorithm) was given to detect and quantify chaos in noisy time series [7]. Here, the surrogate data method and VWK test approach are used to analyze the nonlinearity of surface EMG signals. , ⋅ denotes an average over time t. That is, In order to generate the surrogate data that is consistent with this hypothesis, the algorithm is that one first calculates the mean μ , variance γ and autocorrelation A(1) (in Eq. (2)) from the original data x. Then, the coefficients in Eq. (1) can be estimated: There are the two algorithms to produce the surrogate data in accord with this hypothesis.
One algorithm is to directly use Eq.3. That is, the coefficients are firstly identified by using the original data. Then, the surrogate data is generated by repeatedly iterating Eq.3. However, the performance of this algorithm is very unstable. If the values of the coefficients are mis-estimated slightly, this algorithm may lead to the iterative results which easily diverge to infinity. The alternative algorithm is that a surrogate data is generated by randomizing the phases of a Fourier transform. According to the Weiner-Khintchine theorem, the two algorithms are equivalent in essence [1,10]. The surrogate data has the same Fourier spectrum as the original data. Meanwhile, the algorithm based on the Fourier transform is stabler in the numerical calculation than the first algorithm. The following is the steps of this algorithm.
Let an observed data as ) (n x . The Fourier transform of ) (n x is computed as follows [3]: The Fourier transform has a complex amplitude at each frequency. One can randomize the phases of the Fourier transform by multiplying ϕ i e , where ϕ is independently chosen for each frequency from the [0, 2 π ]. In order to the inverse Fourier transform to be real (no imaginary components), the phases must satisfy the antisymmetric condition ( ) ( ) This point can be easily proved [11]. Proof: According to the nature of DFT of a real time series , then where ) (k φ is the phase angle of ) (k X . k = 0, 1, … N-1. N is the period of Fourier Transform. Then, for k = 0, k = N /2 (N is even), there are In order to ensure that the inverse Fourier transform results are real values, there must be In practical, if the data length N is odd, ϕ(f1)=0, ϕ(fi) =-ϕ(fk), i=2~(N+1)/2, k=N~(N+1)/2+1; If N is even, ϕ(f1)=0, ϕ(fN/2+1)=0, ϕ(fi)=-ϕ(fk), i=2~N/2, k=N~N/2+2. Thus, the surrogate data x' (n) given by the inverse Fourier transform is a sequence of real numbers. Thus, there are no imaginary components (see Fig. (1a)). The values of the imaginary parts are very little (magnitude 10 -14 ) so that they can be regarded as computing precision errors.
The surrogate data has the same Fourier transform spectrum as the original data by using this algorithm. The reproduced "pure" frequencies are very well. Fig.(1b)shows that the previous FT algorithm [1] cannot make the imaginary components of Fourier inverse transform to be zero. So, if one only uses the real part of Fourier inverse transform as surrogate data and omit its imaginary components, the obtained surrogates would have the two limitations [1,12].

Null hypothesis 4
The observed data is produced by the static nonlinear transform of linear gaussian process.
The static nonlinear transform is that the observation or measure function is nonlinear. The static means that the measure data xt only depends on the state yt of the dynamic process at the time t , not on derivatives or values in the past. Let h be a measure function, then The generated surrogate data not only contain the linear correlated characteristic, but also can reflect the static, monotonic nonlinearity of the original data. Strictly speaking, time series in this class are nonlinear. But this nonlinearity is not from the dynamics. This hypothesis can be used to indicate whether the nonlinearity is from the dynamical system or the amplitude distribution (i.e. the measure process).
For generating surrogate data corresponding to this null hypothesis, an algorithm is described. The aim is to shuffle the time-order of the data xt and to preserve the linear correlations of the underlying time series yt = h -1 (xt). The first step is to make a Gaussian time series yt, where each element is generated independently from a Gaussian pseudorandom number generator. Next, we rescale yt in accordance with the time-order of the original data xt. The re-ordered yt has a time series which "follows" the static, monotonic nonlinearity of the original data. Then, the data t y ′ is created by using the above FT algorithm to deal with the re-ordered yt. Finally, the raw data xt is rescaled in terms of the time-order of the data t y ′ to generate the surrogate data t x′ . The "underlying" time series (yt and t y ′ ) are Gaussian and have the same Fourier power spectrum. The produced t x′ matches the amplitude distribution of the raw data xt.

Test statistics[6]
The test statistic is a value which estimates some certain aspects of the time series. To compare the raw data to its surrogate data sets, a suitable test statistic must be selected. A useful statistic should be pivotal and independent of the way that surrogate data sets are generated. In other words, for every data set z and every realization zi of any Fi∈F φ , their test statistics should be different, i.e.
where F φ represents the null hypothesis process. Meanwhile, the distribution of T under the null hypothesis does not depend on μ or σ. Here we give two discriminating statistics as follows: where "¯" denotes the average of the data. The mean μ and varianceσ have no effect on the T value in Eq. (13). Therefore, some linear structure characteristics can be determined except for the mean and variance. The T value in Eq. (14) can judge if the surrogate data are consist with the raw data in the view of the correlation with the mean and variance. The T value in Eq. (15) is a simple skewed difference statistic that is both rapidly computable and often quite powerful.
where ⋅ is mean operator, m is time delay. This statistic T provides a more significant rejection of the hypothesis of the static nonlinear filter of an underlying linear process. Informally, this statistic indicates the asymmetry between rise and fall times in the time series.
The surrogate data method is suitable to detect the nonlinearity of a short, noisy time series. Here, a Gaussian data and a Logistic chaotic time series are used to study the performance of surrogate data method. For a two-sided test, the probability of rejecting the null hypothesis is given by the confidence level p, the surrogate data sets B must be at least as large as . For 95% confidence level, there should be 39 sets of surrogate data.
A Gaussian data x is a random time series with zero mean and unit variance produced by the pseudorandom generator. The data length is 1000 points. According to the null hypothesis 3, 39 sets of surrogate data are generated by using our above FT algorithm. The T value is calculated by Eq. (13) and Eq. (14), respectively. In the Figure (2a and b), there are no statistical discrepancy between the test statistic T of the raw data x and those of surrogate data.
The statistic T values of the raw data are on the range of the empirical distribution of T given by the surrogate data. The results show that the generated surrogate data has the same Fourier transform spectrum as the raw data besides the same mean and variance as the raw data because the T value in Eq. (14) is a measure of the time irreversibility of the data. The null hypothesis 3 is accepted at the confidence level 95%. The raw data is consistent with the stochastic process of the null hypothesis 3. The surrogate data produced by the above FT algorithm is equivalent to the raw data. The generated surrogate data reflects the null hypothesis 3. To further test that the surrogate data based on the above FT algorithm can be used to detect nonlinearity of a time series, we apply the Logistic chaotic system to produce a chaotic time series as follows.
, e is white noise with mean 0, variance 0.001 2 . If e takes part in the evolution process of the above equation, it is called as interior noise (or dynamic noise); otherwise it is called as measure noise.
a. The length N=5000 of the raw time series b. The length N=500 of the raw time series For the case without noise e = 0, we use Eq. (16) to compute the two Logistic chaotic time series with the length of 5000 points and 500 points. 39 surrogate data generated by the above FT algorithm contain the linear properties of the original data in terms of the null hypothesis 3. In Fig.3, we can see the obvious difference between the original data and its surrogate data, regardless of the length of 5000 points or 500 points. The null hypothesis can be rejected in 95% confidence level. The original data has nonlinear components. The results show that the data length has little effect on the surrogate data method based on the above FT algorithm. In Figure 4, we study the nonlinear test of Logistic chaotic time series with measure noise and interior noise, respectively. The data length is 1000 points. According to the null hypothesis, 39 sets of surrogate data are generated. The statistic T for the original data is significantly different from the values obtained for the surrogate data sets. The null hypothesis 3 can also be rejected in 95% significance. The nonlinearity of the original data can be detected. To sum up, the length and noise has no impact on the surrogate data method based on our FT algorithm.  [3,13,14] The nature of SEMG plays an important role in neuromuscular disorder diagnosis, muscle fatigue monitoring, prosthesis control, etc. Here the analyzed data are collected from physiological instruments. Humid surface electrode and AD12-16LG collecting card of physiology signal are used in the whole experiment that was done at Hua Shan Hospital in Shanghai. The data are sampled at 1kHz for the action surface EMG (ASEMG) [3]and the fatigue surface EMG (FSEMG) when one hand carries a 1kg heavy thing [15](see Fig. 5). The length of data for ASEMG is 1000 points during the beginning of action because this time span contains the information of the forearm movement. In the case of carrying a 1kg heavy thing, the length of FSEMG data is also 1000 points when the arm has been fatigue.
For these surface EMG signals, 39 surrogate data are produced by the null hypothesis 2. The surrogate data analysis is given for the action surface EMG signal and the fatigue surface EMG signal, respectively(see Fig. 6). The results show that for action surface EMG signal and fatigue surface EMG signal, their T values are obviously different from those of surrogate data in terms of Eq.13. The null hypothesis 2 can be rejected in 95% degree of confidence. The action surface EMG signal and fatigue surface EMG signal is not stochastic signal produced by a linear stochastic process, but contains nonlinear components. However, this result could not ensure that this nonlinearity must be from the dynamic system.
a. a typical action surface EMG wave b. a typical fatigue surface EMG wave    In order to test that the nonlinear components are intrinsic deterministic, we further assume that ASEMG is stochastic signal consistent with the null hypothesis 4. Fig.7 gives the T values of ASEMG and surrogate data calculated by Eq. 15. This statistic indicates the asymmetry between rise and fall times in the time series. From this figure, we can see that there is the difference between data and surrogates, and the null hypothesis 4 is rejected in 95% credibility. This result shows that the nonlinearity of ASEMG is intrinsic and deterministic.

Volterra-Wiener-Korenberg test model[2]
For a dynamic system, an observed time series where the memory k and combination degree d correspond to the embedding dimension and the degree of nonlinearity of the model, respectively. The coefficients am are recursively estimated through a Gram-Schmidt procedure from linear and nonlinear autocorrelations of the data itself with a total dimension M=(k+d)!/(d!k!).
There is the following information criterion in accordance with the parsimony principle:  For each data series, there is the following numerical procedure to search for the optimal pair {kopt, dopt}: 1. when d=1, search for kopt which minimizes C(r). 2. with k=kopt, increasing d>1, search for dopt which minimizes C(r).
3. calculate C lin (r) with d=1 and k=M-1, and C nl (r) with d=dopt and k=kopt. 4. Compare C lin (r) and C nl (r), if C nl (r) is obviously smaller than C lin (r), then the original system dynamics is nonlinear, the obtained time series is nonlinear, even chaos; otherwise, the original system dynamics is linear, the raw data is linear.
Note that when kopt is rather large, M is quite large, too, then the computational time will rapidly go up. In this case, k and d should be adjusted synchronously to search for kopt and dopt so as to make C nl (r) < C lin (r). Furthermore, one can obtain the corresponding linear and nonlinear models for surrogate data generated by the FT algorithm according to the null in the statistical sense. [15,16] Here, the VWK method is used to deal with the surface EMG signals in Fig.5. For the action surface EMG signal, C lin (r) is almost equal to C nl (r), i.e.

Analysis of SEMG based on VWK method
, this technique can hardly detect its nonlinearity (see Fig.8a). For the fatigue surface EMG signal, C nl (r) is distinctly smaller than C lin (r) so that its nonlinear component can be detected (see Fig.8b). So VWK technique can effectively detect the nonlinear dynamic speciality of fatigue surface EMG signal but fails to test the nonlinearity of the action surface EMG signal. In other words, VWK technique can not be used directly to deal with the action surface EMG signal.
a. The action surface EMG signal b. The fatigue surface EMG signal   [16] In order to detect the nonlinearity of the action surface EMG signal, 39 FT-based surrogate data are used according to the null hypothesis 3. The generated surrogate data contain the linear properties of the raw data. Figure 9 is the analysis of surface EMG signal based on VWK with surrogate data. We can see that no matter whether it is the action or fatigue EMG signal, ) (r C nl orig is always smaller than ) (r C nl surr . The null hypothesis 3 can be rejected in 95% significance. The results illuminate that the action and fatigue surface EMG signals contain nonlinear dynamic properties.

Analysis of the surface EMG signals based on chaos theory
The discovery of chaotic phenomena is the third major breakthrough in the 20th century physics scientific community following the creation of relativity and quantum mechanics. It organically combines the two major theoretical systems of determinism and probabilism that have long been debated to create a scientific model of a new paradigm, so that people can use some simple rules to explain seemingly stochastic information in the past [17][18][19]. The practical significance that finds chaotic phenomena is to recognize that a deterministic nonlinear system can have inherent uncertainty. Perhaps a system has only a few degrees of freedom, but it can produce complex, similar to the random output signal. In the past, one could only denote a random-looking data as a random process from the view of the traditional time series analysis. The statistical methods or random time series models were used to analyze the data. Since the chaotic phenomenon was discovered by Lorenz [17], people have begun to reunderstand and restudy these random-looking signals so as to reveal the inherent deterministic mechanisms of these signals. That is, it is to explore that the systems which generate these signals may contain essentially deterministic characteristics. Chaos phenomenon breaks the path that the regularity is found in a lot of completely different systems. This will lead to a revolution in the field of influence of various disciplines. It is chaos to lead people to explore the complexity in nature.
At present, the idea of Chaos has been introduced into the analysis of time series to create the field of chaotic time series analysis. Since the inception of chaotic time series analysis, it has quickly been penetrated into other disciplines and engineering fields. Thus it becomes the most active branch of the modern nonlinear dynamics. This section describes the chaos definition and the phase space reconstruction of chaotic time series, discusses some parameters that are used to analysis chaotic time series, such as the correlation dimension and Lyapunov exponent, study the principal component analysis methods based on SVD, and propose the symplectic principal component method based on symplectic geometry. Then we use these methods to investigate the surface EMG signals.

Chaos and its definition
Chaos is "order in disorder". The order means its deterministic nature. The disorder means that the final results can be unpredictable for a long time. As a scientific concept, chaos generally denotes that the long-term dynamical behavior of a deterministic nonlinear system manifests as a random-looking behavior. Mathematically speaking, "chaos" has not been a unified strict definition. For the definition of chaos, there are at least nine different definitions, where the three definitions given by Li-Yorke, Devaney, Marotto are more commonly used. Here describes the definition of chaos by Li-Yorke [18].

Li-Yorke Theorem:
Let has a periodic point with period 3, then for any positive integer n=1, 2, 3, …, there is a periodic point with period n. This is the famous period 3 theorem. It becomes a milestone in the development history of chaos theory and promotes the creation and development of chaos theory. From this theorem, the first formal mathematical definition of the chaos is given.

Chaos definition: Let
) (x f as a continuous self-map in closed interval I, i.e. where If Eq.20 satisfies the following conditions, then it has chaotic motion: 1. The period of periodic point of f has no upper bound. 2. There is an uncountable set I S ⊂ , which satisfies the following conditions: This definition explains "existence" of chaos in mathematics. According to the above theorem and the definition, the description of chaotic motion is different from the general periodic and quasi-periodic motion. Its motion is not a single periodic orbit but an envelope for a bunch of tracks, where the infinite number of countable stable periodic orbits and uncountable stable aperiodic orbits are embedded densely. Meanwhile, there is at least one unstable aperiodic orbit. Overall, the chaos not only contains some inherent regularity, but also shows that the system has ergodicity. That is, the system has a long-term unpredictability. In other words, the long-term behavior of the system can not be predicted if the system displays the so-called "sensitive dependence on initial conditions". The meaning of this definition is that the aperiodicity of chaotic system is exhibited accurately. For a dynamical system, the observable behaviour was called stochastic in the past. In fact, it can be random-looking, i.e. "stochastic behaviour occurring in a deterministic system". Therefore, it is challenging to quantitatively describe the nature of chaotic dynamics and distinguish between the so-called random and chaotic motions from a time series, especially from an experimental time series. At present, chaotic time series analysis methods have been widely attention in fields of mathematics, physics, biology, biomedicine, robotics, geology, engineering, economics, finance, and so on.

Phase space reconstruction
Phase space reconstruction is generally the first step of chaotic time series analysis from a time series data. The dynamic characteristic of the system can be explored through phase space reconstruction of the original time series so that the mechanism of the original system can be revealed from the original time series [20]. It has been proved by the so-called Takens' embedding theorem [21]. According to the theorem, the reconstructed phase space can maintain the invariance of geometry for the original dynamical system [22], such as the characteristic value of the fixed point, the fractal dimension of the attractor, the Lyapunov exponent in the phase space orbit, and so on.
as two metric spaces. If the mapping are called as the isometric isomorphism.
a smooth function, it is a generic property that the map In terms of the above definitions and theorem, and the subspace are isometric isomorphic. Then the manifold M of dimension m can be embedded into . In other words, . For a practical time series } { t x , the state of the original system is equivalent to the m-dimensional manifold M. In fact, , ϕ is a smooth diffeomorphism. t y denotes the state of the system M at time t. d τ is the delay time. The signal observed in M at time t consist of . If the embedding dimension is greater than 2 times the dimension m of the attractor of the original system, the phase space with the base of the practical signal delay time coordinates is equivalent to the state space of the original system. That is, Takens' embedding theorem states that if the time series is indeed composed of scalar measurements of the state from a dynamical system, then under certain genericity assumptions, a one-to-one image of the original set {x} is given by the time-delay embedding, provided d is large enough. At present, the delay coordinate method has been widely used to give the phase space reconstruction from the original signal.
For a time series x(t) observed by the measure function h, i.e.
the vector t X can be constructed as follows, where d τ is an integer multiple of the sampling interval τ , called as the lag time or delay time. d is and embedding dimension, d≥2m+1. m is an attractor dimension of the original system.

Problems in phase space reconstruction
Takens' embedding theorem offers in the absence of noise, the possibility of reconstructing n-dimensional dynamics from one-dimensional infinite data of one observable-measurable system. This means that in the case of any delay time, a time series can always be embedded into the state space of the system, and when the embedding dimension is sufficiently large, reconstructed space and embedded space is almost one-to-one correspondence. Therefore, one can reconstruct a phase space from an experimental time series so as to estimate dynamical invariants of the time series, such as dimensions, Lyapunov exponents, entropies [21,23,24] and so on. However, the embedding theorem does not directly answer how to choose embedding dimension d and delay time t. In practical application, the experimental data is always limited and noisy so that the estimation of the above parameters presents some difficulties [25,26]. Accuracy of the phase space reconstruction is critically important to the estimation of invariant measures characterizing system behavior.
The choice of delay time d τ and embedding dimension d always has a great impact on the phase space reconstruction.
Some researchers have studied the choice of delay time d τ [26][27][28][29][30]. If the delay time d τ is too small, the reconstructed attractor will be crowded around the main diagonal, which is called as redundance. If d τ is too large, the dynamic shape of the attractor will be broken, which is called as irrelevance, and the phase space reconstruction is no longer representative of the true dynamics in the real system [28]. In normal circumstances, in order to make the elements of t Y independent, d τ is the same for all the embedding dimension d [27][28][29]. The autocorrelation function method and mutual information technique [30] have been most commonly used to give the delay time d τ although the issue of the delay time choice has still been completely resolved.
For the embedding dimension d, there are three methods that are usually used to choose the appropriate embedding dimension, including the correlation dimension, singular value decomposition(SVD), the false neighbors [21,31,32]. The correlation dimension method is to estimate appropriate dimension d in terms of the correlation theorem [8,21,33]. By increasing the embedding dimension, one notes an appropriate dimension d when the value of the correlation dimension stops changing. Broomhead and King [31] used the singular value decomposition (SVD) technique to determine an appropriate embedding dimension d directly from the raw time series. The false neighbor method is based on the fact that choosing a too low embedding dimension results in points, which are far apart in the original phase space, being moved closer together in the reconstruction space [32]. Besides, there are also some other methods and modified extensions developed based on the above methods. However, there are still problems on how to determine the appropriate embedding dimension from a scalar time series [34][35][36][37][38].

Correlation dimension
If a system is chaotic, the strange attractor in a region of the phase space constitutes an infinite hierarchy of self-similar structure, i.e. a fractal structure. One can use quantitative measures to define the fractal nature. The correlation dimension is a useful measurement. Grassberger and Proccacia give a kind of computation method, called GP algorithm [33,39,40].

GP algorithm of correlation dimension
Let X1, X2, ..., Xn be a point of the attractor in phase space. Cl(Xj) is denoted as a hypersurface sphere with the radius l at the reference point Xj.
Then a correlation integral function is defined as In practical computation, D2 is the slope of log C vs log l curve over a selected straight line range.

Correlation dimension theorem[41]
. A is an attractor of map G that has only a finite number of periodic points of period P. Under a natural probability measure μ , the correlation The theorem says that with the embedding dimension increasing, the slope of corresponding correlation integral curve will converge to the correlation dimension 2 D of the original system attractor. Therefore, the optimal embedding dimension can be estimated by using the correlation dimension 2 D . That is, if the embedding dimension , the slope of the correlation integral curve is equal to the correlation dimension. This also indicates that the dimension estimation actually does not have to meet the requirements of the embedding theorem on the embedding dimension . When the embedding , the reconstructed attractor can contain the fractal structural feature of the original system attractor to reflect the chaotic characteristics of the original system. Correlation dimension has been widely used in the analysis of chaotic time series.

Chaotic test based on correlation dimension
Chaos has a fractal structure so that the corresponding correlation dimension D2 is a fractional value. The estimation of correlation dimension D2 from a time series can be used to determine whether the time series is chaotic. If D2 is fractional, the original time series can have chaotic features, otherwise, it cannot be chaotic. According to the correlation dimension theorem, when the embedding dimension d of the reconstructed phase space is increased to a certain value, the correlation dimension D2 will be saturated. Then, the optimal embedding dimension d will be given from a time series. The corresponding correlation dimension D2 is called as the correlation dimension of this time series.
Lorenz chaotic time series is given by the state variable x of Lorenz system as follows.
The sampling points N=1000. For delay time τd=τ, the corresponding correlation dimension values are given in Table 1 when the embedding dimension d is increased from 2 to 12. From this table, we can see that the correlation dimension of the time series is about 2.07. The result shows that the reconstructed attractor has a fractal structure to reflect the chaotic feature of the system. The time series can reconstruct the state space of the original system when the embedding dimension d=6. The length N is 1000 points. Here,τd=1 (i.e. discrete time series interval). With increasing the embedding dimension d, the corresponding correlation dimension is 0.97 (see Table 2). The optimal embedding dimension is 2.

Logistic chaotic time series
For finite sampling number (e.g. N=1000), the reconstructed attractor will be broken when the embedding dimension d is increased continuously to a higher value. The estimation of correlation dimension will fail during computation. Therefore, embedding dimension d should not be unlimitedly increased.

The analysis of surface EMG signal based on correlation dimension
From the above analysis, we can see that the surface EMG signal has deterministic nonlinear component. Here, the correlation dimension is further used to study whether its nonlinear component are chaotic. Figure 10a shows a raw data for forearm pronation. Figure 10b gives the correlation integral curve of the data under the embedding dimension from 2 to 12. In  ± (see Table 3). The result indicates that the surface EMG signal during movement has fractal feature to reveal the implied chaotic motion behavior. Table 3 shows that the reconstructed attractor contains the nature of the raw system when the embedding dimension is over 6.

Study of surrogate data test method based on correlation dimension
The correlation dimension is a quantitative index that describes the fractal structure of chaotic attractor. It measures the freedom degree and complexity of the system. For the raw data and all data of φ F F ∈ , in the case of the same embedding dimension, the corresponding test results will be obviously significant [42]. Here, the correlation dimension is used as a test statistic to analyze the surface EMG signal. According to the null hypothesis 3, 39 sets of surrogate data are produced in confidence level 95%. In order to quickly obtain the correlation dimension in a variety of circumstances, the linear parts of all correlation integral curves are taken as the same. Though these values are not accurate, this test algorithm is great effective to test the chaotic fractal feature of the experimental data.
When the sampling interval τ=1, Figure 11a and b show the results of the surrogate data test analysis for Lorenz chaotic time series based on correlation dimension. There are significant differences between the original data and its surrogate data in m = 5 (see Fig. 11a). This result explains that the null hypothesis can be rejected in confidence level 95%. The a. * is D2 value of raw data in m = 5,  Figure 11. The surrogate data test analysis based on correlation dimension for Lorenz chaos time series by sampling intervals differences disappear between the raw data and its surrogate data in m = 10 (see Fig. 11b). This illustrates that the reconstructed attractor appears broken. The reconstructed phase space is similar to that of the surrogate data with linear stochastic noise characteristics. Figure 11c and d show the results with τ=0.1. The differences between the raw data and its surrogate data can be seen in Fig. 11c (m = 4). When m>2, these differences become larger as m increases. Figure 11e and f show the results with τ=0.005. Figure 11e shows the surrogate data test histogram in m = 2. The correlation dimension curves of the raw data and its surrogate data are given in m = 2 ~ 10 (see Fig. 11f). Even in the case of oversampling, the correlation dimension as test statistic can also make the surrogate data method very effective. Figure 12 shows the surrogate data analysis for the surface EMG signal in Fig. 12a based on correlation dimension. When m = 6, the correlation dimension value of the raw data is different from those of its surrogate data generated by the null hypothesis 3. The correlation dimension curves of the raw data and its surrogate data are given when m = 2 ~ 8 in Fig. 12b.

The surface EMG signal analysis based on the surrogate data and correlation dimension
We can see the differences between the original data and its surrogate data. The null hypothesis 3 can be rejected in confidence level 95%. The result indicates that the surface EMG signal has deterministic nonlinear components, even chaotic.
a. * is D2 of surface EMG in m = 6, histogram is D2 distribution of surrogate data, abscissa is D2, ordinate is histogram b. The curves of surface EMG and surrogate data with m, abscissa is m=2~8, ordinate is D2 Figure 12. The surrogate data test analysis based on correlation dimension for surface EMG signal.

Largest Lyapunov exponent
The Lyapunov exponent method is to directly identify whether a system is chaotic. If the system is chaotic, the Lyapunov exponent is positive. Otherwise, the Lyapunov exponent is negative. For this, the Lyapunov exponent can be used to test the chaotic feature of a signal under study. The first algorithms developed computed the whole Lyapunov spectrum by Wolf et al. [43] and Sano et al. [44]. Meanwhile, the largest Lyapunov exponent is sufficient for assessing the presence of chaos. At present, there are many algorithms to estimate the largest Lyapunov exponent from a time series, such as an algorithm given by Rosenstein et al. [45]. This algorithm is aimed specifically at estimating the largest Lyapunov exponent from short data.

Algorithm of largest Lyapunov exponent estimation[45]
For a short time series, Rosenstein et al. present a robust estimation algorithm of the largest Lyapunov exponent. First, the attractor is reconstructed, refer to Eq. 25. Next, the algorithm locates the closest neighbor of each point Xi on the trajectory, with respect to the Euclidian distance. Then, one defines the distance between two neighboring points at instant n=0 by: where • is the Euclidian norm. Here, the temporal separation of the nearest neighbors should be greater than the mean period of the time series.
According to time, the average distance between two neighboring vectors can be simply Assume that the system is controlled by the largest Lyapunov exponent only. Then, the distance between two neighbor points obey the following relationship: For , there is:  Then, the Lyapunov exponent can be given by using a least-squares fit to the "average" line: . The largest Lyapunov exponent λ is the slope of y(n) in the above equation.
This method is deduced directly from the largest Lyapunov exponent definition. The accurate evaluation of λ depends on the full use of the data. In practice, the curve y(n) will tend to saturation. The largest Lyapunov exponent λ is given by computing the slope of the linear part in the curve y(n).

Chaos test based on largest Lyapunov exponent
In general, if the signal is chaotic, the slope of the curve y(n) will be independent of the embedding dimension. Otherwise, if the signal is not chaotic, the slope of the curve y(n) will depend on the embedding dimension. When the embedding dimension m is chosen from 2 to 8, the Lyapunov exponent of the curve y(n) of the signal is shown in Figure 13. For a chaotic signal, a good illustration is given (see Figure 13a, b and c). The y(n) curves are different from those of a non-chaotic signal (compare with Figure 13d and e). However, even for chaotic signals, the y(n) curves are not always parallel. For example, in the case of undersampling ( 1 = τ ), the y(n) curves of Lorenz chaotic time series are similar to those of linear stochastic process(compare with Figure 13e and f). In the literature [23], the y(n) curves of the Ikeda chaotic time series are also not parallel. Figure 14 gives the curves of Lyapunove exponent y(n) for the surface EMG signal. The y(n) curves are not very parallel for the surface EMG signal. It is difficult to distinguish the curves of y(n) for the surface EMG signal from those of Figure 13d, f and g. The surface EMG signal can not be determined as chaotic, or as stochastic. But it can be a highdimensional system.

Principal component analysis
Broomhead and King [31] proposed the idea of singular system analysis that determines an appropriate embedding dimension d directly from the raw time series. It provides its convenience for the further analysis of the given system. Numerical experience, however, led several authors to express some doubts about reliability of singular system analysis in the attractor reconstruction [46][47][48]. Palus and Dvorak [37] explain why singular-value decomposition(SVD), the heart of the singular system analysis and by nature a linear method, may become misleading technique when it is used in nonlinear dynamics studies that reconstruction parameters are time-delay, embedding dimension (or embedding windows). For this, we propose a novel nonlinear analysis method based symplectic geometry, called symplectic principal component analysis(SPCA) [49].

Principle and algorithm of principal component analysis
Let a time series x1, x2,..., xn be the measured signal by sampling interval ts, n is the number of samples. According to Takens' embedding theorem, a trajectory matrix X can be given by time delay coordinates method, refer to Eq. 25(τd=1): where d is embedding dimension. m=n-d+1 is the number of points in d-dimension reconstruction attractor, , denotes a point in the attractor. For the matrix X, there are a m×d orthogonal matrix V and a d×d orthogonal matrix. The matrix X can be decomposed as follows.
where S is d×d diagonal matrix, whose elements are defined Since the matrix V is orthogonal, then ij ij Meanwhile, for the matrix U, there are In order to facilitate the calculation, Broomhead et al. applies the covariance matrix C of the matrix X to replace the matrix X. The details are as follows: Its values reflect the degree of correlation between the time delay coordinate variable i and j.
Let Y=XU, then: where U T CU is the covariance matrix of the matrix Y. Its elements are zero, except that the diagonal elements are equal to σi. This means that the variables i and j of the matrix Y are independent. The coordinate system is orthogonal, which is constituted from the variables of the matrix Y after the above transformed. The σi is called the principal component or In order to filter out noise, the trajectory matrix X is first projected into the coordinate system U.

XU Y =
(51) The variables in the matrix Y are independent. Then, the original coordinate system is updated by using the matrix Y: That is, X is a new time delay coordinate system. Besides, the new coordinate system corresponding to the principal axes can eliminate the noise floor to reduce the noise from the data. However, the truncated position of the principal components depends on the signal-noise-ratio, especially for the measurement noise. The principal components of the chaotic time series based on SVD spectrum more easily subject to the measurement noise so that the embedding dimension estimation is directly affected. For the smaller noise, there is the more number of principal components above the noise floor. For the larger noise, the number of the corresponding principal components will be reduced. Here, the above calculation accuracy is 2.2204e-016, which does not consider the numerical calculation error.

Influence of sampling interval on the principal component spectrum of chaotic time series[49]
The Lorenz chaotic system is considered to give the state variable x in order to study the influence of sampling interval on the principal component spectrum. The principal component spectrum slant and have no floor for the chaotic time series x with τ=0.005 (see Fig. 16a). When τ=0.1 (see Fig. 16b Fig. 16c). It shows that the distribution of the total energy has little difference in each principal axis, like the Gaussian noise (see Fig. 16d). For the Gaussian noise, its principal component spectrum curves are horizontal lines, where N=10000. It shows that every principal component is equal to each other. The energy distributes into every principal axis averagely. Therefore, it can be seen that sampling interval affects the determination of embedding dimension. When the sampling interval is not undersampling, the determination of embedding dimension depends the amount of signal-noise-ratio. In the case of undersampling, the chaotic time series is similar to noise so that the embedding dimension seems to be estimated as 1.

Symplectic principal component analysis
The symplectic geometry is a kind of phase space geometry. Its nature is nonlinear. It can describe the system structure, especially nonlinear structure, very well. It has been used to study various nonlinear dynamical systems [50][51][52] since Feng Kang [53] has proposed a symplectic algorithm for solving symplectic differential. However, from the view of data analysis, few literatures have employed symplectic geometry theory to explore the dynamics of the system. Our previous works have proposed the estimation of the embedding dimension based on symplectic geometry from a time series [49,[54][55][56]. Subsequently, Niu et al. have used our method to evaluate sprinter's surface EMG signals [57]. Xie et al [58] have proposed a kind of symplectic geometry spectra based on our work. Subsequently, we show that SPCA can well represent chaotic time series and reduce noise in chaotic data [59,60].
In SPCA, a fundamental step is to build the multidimensional structure (attractor) in symplectic geometry space. Here, in terms of Taken's embedding theorem, we first construct an attractor in phase space, i.e. the trajectory matrix X from a time series. That is, for a measured data (the observable of the system under study) x1, x2, ..., xn recorded with sampling interval ts, the corresponding d-dimension reconstruction attractor, Xm×d can be given (refer to Eq.40).Then we describe the symplectic principal component analysis (SPCA) based on symplectic geometry theory and give its corresponding algorithm.

Symplectic principal component method
SPCA is a kind of PCA approaches based on symplectic geometry. Its idea is to map the investigated complex system in symplectic space and elucidate the dominant features underlying the measured data. The first few larger components capture the main relationship between the variables in symplectic space. The remaining components are composed of the less important components or noise in the measured data. In symplectic space, the used geometry is called symplectic geometry. Different from Eulid geometry, symplectic geometry is the even dimensional geometry with a special symplectic structure. It is dependent on a bilinear antisymmetric nonsingular cross product--symplectic cross product: The measurement of symplectic space is area scale. In symplectic space, the length of arbitrary vectors always equals zero and without signification, and there is the concept of orthogonal cross-course. In symplectic geometry, the symplectic transform is the nonlinear transform in essence, which is also called canonical transform, since it has measure preserving characteristics and can keep the natural properties of the original data unchanged. It is fit for nonlinear dynamics systems.
The symplectic principal components are given by symplectic similar transform. It is similar to SVD-based PCA. The corresponding eigenvalues can be obtained by symplectic QR method. Here, we first construct the autocorrelation matrix Ad×d of the trajectory matrix Xm×d. Then the matrix A can be transformed as a Hamilton matrix M in symplectic space.
, then S is a symplectic matrix. For Hamilton matrix M，its eigenvalues can be given by symplectic similar transform and the primary 2d dimension space can be transformed into d dimension space to resolve [17][18][19] , as follows: ii. Construct a symplectic matrix Q, where B is up Hessenberg matrix (bij=0, i>j+1). The matrix Q may be a symplectic Household matrix H. If the matrix M is a real symmetry matrix, M can be considered as N.
Then one can get an upper Hessenberg matrix (referred to equ. 13), namely, iv. These eigenvalues are sorted by descending order, that is Thus the calculation of 2d dimension space is transformed into that of that of d dimension space. The μ is the symplectic principal component spectrums of A with relevant symplectic orthonormal bases. In the so-called noise floor, values of , , reflect the noise level in the data [49,55]. The corresponding matrix Q denotes symplectic eigenvectors of A.

Proposed algorithm of symplectic principal component method
For a measured data x1, x2, ..., xn, our proposed algorithm consists of the following steps: 1. Reconstruct the attractor Xm×d from the measured time series, where d is the embedding dimension of the matrix X, and m = n-d+1. 2. Remove the mean values Xmean of each row of the matrix X. 3. Build the real d×d symmetry matrix A, that is, Here, d should be larger than the dimension of the system in terms of Taken's embedding theorem. 8. For the noisy time series, the first estimation of data is usually not good. Here, one can go back to the step (6) and let Xi =Xs in Eq.(65) to do step (6) and (7) again. Generally, the second estimated data will be better than the first estimated data.
Besides, it is necessary to note that for the clean time series, the step (8) is unnecessary to handle.

Performance evaluation
SPCA, like PCA, can not only represent the original data by capturing the relationship between the variables, but also reduce the contribution of errors in the original data. Here, the performance analysis of SPCA is studied from the two views, i.e. representation of chaotic signals and noise reduction in chaotic signals.

Representation of chaotic signals
We first show that for the clean chaotic time series, SPCA can perfectly reconstruct the original data in a high-dimensional space. We first embed the original time series to a phase space. Considering the dimension of the Lorenz system(see Eq. 30) is 3, d of the matrix A is chosen as 8 in our SPCA analysis. To quantify the difference between the original data and the SPCA-filtered data, we employ the root-mean-square error (RMSE) as a measure: where ) (i x and ) ( i x are the original data and estimated data, respectively. When k = d, the RMSE values are lower than 10 -14 (see Figure 17). In Figure 17, the original data are generated by Eq. 30. The estimated data is obtained by SPCA with k=d. The results show that the SPCA method is better than the PCA. Since the real systems are usually unknown, it is necessary to study the effect of sampling time, data length, and noise to the SPCA approach. From the Figure 17 and 18, we can see that the sampling time and data length have less effect on SPCA method in the case of free-noise.  For analyzing noisy data, we use the percentage of principal components (PCs) to study the occupancy rate of each PC in order to reduce noise. The percentage of PCs is defined by % 100 where d is the embedding dimension, i μ is the i-th principal component value. From the Figure 19, we find that the first largest symplectic principal component (SPC) of the SPCA is a little larger than that of the PCA. It is almost possessed of all the proportion of the symplectic principal components. This shows that it is feasible for the SPCA to study the principal component analysis of time series.
Next, we study the reduced space spanned by a few largest symplectic principal components (SPCs) to estimate the chaotic Lorenz time series (see Fig. 20). In Figure 20, the data x is given with a sampling time of 0.01 from chaotic Lorenz system. The estimated data is calculated by the first three largest SPCs. The average error and standard deviation between the original data and the estimated data is -6.55e-16 and 1.03e-2, respectively. The estimated data is very close to the original data not only in time domain (see Figure 20a) but also in phase space (see Figure 20b). We further explore the effect of sampling time in different number of PCs. When the PCs number k =1 and k =7, respectively, the SPCA and PCA give the change of RMSE values with the sampling time in Figure 21. We can see that the RMSE values of the SPCA are smaller than those of the PCA. The sampling time has less impact on the SPCA than the PCA. In the case of k = 7, the data length has also less effect on the SPCA than the PCA(see Fig. 22).
Comparing with PCA, the results of SPCA are better in the above Figures. We can see that the SPCA method keep the essential dynamical character of the primary time series generated by chaotic continuous systems. These indicate that the SPCA can reflect intrinsic nonlinear characteristics of the original time series. Moreover, the SPCA can elucidate the dominant features underlying the observed data. This will help to retrieve dominant patterns from the noisy data. For this, we study the feasibility of the proposed algorithm to reduce noise by using the noisy chaotic Lorenz data.

Noise reduction in chaotic signals
For the noisy Lorenz data x, the phase diagrams of the noisy and clean data are given in Figure 23a and 23b. The clean data is the chaotic Lorenz data x with noise-free (see Eq. 30). The noisy data is the chaotic Lorenz data x with Gaussian white noise of zero mean and one variance (see Eq. 30). The sampling time is 0.01. The time delay L is 11 in Figure 23. It is obvious that noise is very strong. The first denoised data is obtained in terms of the proposed SPCA algorithm (see Figure 23c-f). Here, we first build an attractor X with the embedding dimension of 8. Then the transform matrix W is constructed when k=1. The first denoised data is generated by Eq. (65) and (66). In Figure 23c, the first denoised data is compared with the noisy Lorenz data x from the view of time field. Figure 23d shows the corresponding phase diagram of the first denoised data. Compared with Fig. 23a, the first denoised data can basically give the structure of the original system. In order to obtain better results, this denoised data is reduced noise again by the step (8). We can see that after the second noise reduction, the results are greatly improved in Fig. 23e and 23f, respectively. The curves of the second denoised data are better than those of the first denoised data whether in time domain or in phase space by contrast with Fig. 23c and 23d. Figure 23g shows that the PCA technique gives the first denoised result. We refer to our algorithm to deal with the first denoised data again by the PCA (see Figure 23h). Some of noise has been further reduced but the curve of PCA is not better than that of SPCA in Figure 23e. The reason is that the PCA is a linear method indeed. When nonlinear structures have to be considered, it can be misleading, especially in the case of a large sampling time (see Figure  24). The used program code of the PCA comes from the TISEAN tools (http://www.mpipksdresden.mpg.de/ ~tisean).  Figure 24 shows the variation of correlation dimension D2 with embedding dimension d in the sampling time of 0.1 for the clean, noisy, and denoised Lorenz data. We can observe that for the clean and SPCA denoised data, the trend of the curves tends to smooth in the vicinity of 2. For the noisy data, the trend of the curve is constantly increasing and has no platform. For the PCA denoised data, the trend of the curve is also increasing and trends to a platform with 2. However, this platform is smaller than that of SPCA. It is less effective than the SPCA algorithm. This indicates that it is difficult for the PCA to describe the nonlinear structure of a system, because the correlation dimension D2 manifests nonlinear properties of chaotic systems. Here, the correlation dimension D2 is estimated by the Grassberger-Procaccia's algorithm [33,40].

Estimation of embedding dimension based on symplectic geometry
In terms of Eq. 63, the values of μi, i=k+1, ..., d, are far smaller than μk. These values form a noise floor. Therefore, the embedding dimension of the reconstruction system can be determined by the noise floor. Here, the noise and nonlinear time series are used to investigate the feasibility of the embedding dimension estimation based on symplectic geometry.
For noise (which is generally regarded as Gauss white noise with mean value 0 and variance 1 in practical systems), symplectic geometry spectrums of this noise give the even distribution of its total energy (see Fig. 25a). From this figure, we can see that the symplectic geometry spectrums of noise can reflect the characteristic of noise very well when N=1000. This shows SG method can reflect noise level in the condition of short data length. For the time series of state variable x in Logistic chaos system without noise interference, the symplectic geometry spectrums (see Fig. 25b) are slant in the beginning then turn into plane area with the increase of index i. In other words, the distribution of total energy on the different axes is obviously different and with increasing the embedding dimension, the slants of symplectic geometry spectrums transit into noise floor. So one can determine embedding dimension from the number of symplectic geometry spectrums over noise floor, in which its determining criterion is similar to that in [37]. From Fig. 25b, the embedding dimension of Logistic chaotic time series can be estimated at 4 because the symplectic geometry spectrums begin to turn into noise floor at index 5. In a similar way, for Lorenz chaos time series without noise, when sampling interval τ=0.005, the embedding dimension can be estimated at 6 (see Fig. 25c).
Comparison of the results of our method (see Fig. 25b and 25c) and the results of SVD method (see Fig. 25d and 25e) shows that in SG method, the position of the noise floor is determined by the intrinsic dynamical structure of the nonlinear dynamic system rather than the numerical accuracy of the input data and the computation precision, but in SVD method, the noise floor was determined reversely [8,37,61].
In a word, the numerical experiments discuss that for the nonlinear dynamic systems, SG method can give the appropriate embedding dimension from their time series but SVD method cannot. So SG method is fit to deal with nonlinear systems.

Robustness of the embedding dimension estimation based on symplectic geometry
It is well known that the recent methods about embedding dimension are almost more or less subjective, or are affected by changes of the data length, noise, time lag, or sampling time, etc. Here, it is necessary that the robustness of the SG method is studied.

The effect of data length
In order to avoid the effect of the characteristics of the nonlinear system, this paper only considers and uses the noise to analyze the effect of data length. For Gauss white noise with mean value 0 and variance 1, when N=1000, the SG method can give better results than the SVD method (see Fig. 26a) because the total energy is distributed equably (see Fig. 25a).  And yet when N is rather large, e.g. N=10000, the SVD method can just have the similar results (see Fig. 26b) with Fig. 25a. These show that the SG method is more robust to changes of the data length than the SVD method. Then the SG method is fitter to the analysis of short time series. At present, there are many estimators of appropriate embedding dimension, but it has gradually been realized that such estimators are useful only for low-dimensional noise-free systems; such systems, however, seem hardly to occur in the real life. Therefore, this paper studies the robustness of the SG method under noise. For the signal obtained from the real system, it is always contaminated by noise (inner noise or/and outer noise). Although contaminated by inner or/and outer noise, the embedding dimension of Logistic system can always be noted at 4 by using the SG method because the noise floor begins at the embedding dimension 5 (see Fig. 27a and 27b). These show either inner noise or outer noise has little impact on the symplectic geometry spectrums. On the further increase of noise, the position of noise floor is obviously raised from the Figure 27c, but the appropriate embedding dimension 2 can still be obtained. In the similar way, for Lorenz chaos time series without and with noise, when sampling interval τ=0.005, the embedding dimension is 6 without noise and 3 with noise, respectively (see Fig. 25c and Fig. 27d). These results show that the SG method is useful for Lorenz system with noise, too. Meanwhile, we find that the SG method can obtain the results similar to nonlinear high singular spectrum algorithm [62]. Thus, it further shows that the SG method can reflect intrinsic nonlinear characteristics of the raw data.
a. The symplectic geometry spectrums,  For the changes of the sampling interval from τ=0.005 to τ=0.1, this paper finds that the embedding dimension can be estimated at 6 from the corresponding symplectic geometry spectrums of Lorenz chaos time series (see Fig. 25c and Fig. 28a), although the position of noise floor is constantly driven up. However, in the same condition, SVD method cannot give the appropriate embedding dimension (see Fig. 25e and 28b), the results of which are similar to the results of the literature [61]. Besides, no matter the sampling interval is over sampling or under sampling, SG method can always give the appropriate embedding dimension d of Lorenz chaos time series (see Fig. 28c and 28d) because the correlation dimension m of Lorenz system is 2.07, in general, if d>m, d is viable.

Analysis of the surface EMG signal based on symplectic geometry
For the action surface EMG signal (ASEMG) collected from a normal person, SVD method cannot give its appropriate embedding dimension (see Fig. 29a). The method based on correlation theory can do it but costs much time for computation. Here, SG method can fast obtain its embedding dimension. Figure 29b is the symplectic geometry spectrums of action surface EMG signal. The embedding dimension can be chosen as 6, which is the same as that of correlation dimension analysis [3]. This further shows that the SG method has stronger practicability for the small sets of experiment data.
a. The SVD principal component spectrums b. The symplectic geometry spectrums

Study of nonlinear dynamical systems based on multifractal theory
Fractal is a kind of geometry structures that have similarity in structure, form or function between the local and the whole. In nature, almost every object is very complex and performs a self-organization phenomenon that is a spatiotemporal structure or state phenomenon by forming spontaneously. From the view of geometry structure, this object has its own self-similarity properties in many parts, called a multifractal system. This structure can often be characterized by a set of coefficients, such as multifractal dimension, wavelet multifractal energy dimension. The multifractal theory reflects the complexity and richness of the nature in essence.

Self-affine fractal[63-65]
Definition 1 Let the mapping n n R R S → : . S is defined by where T is a linear transformation on R n . b is a vector in R n . Thus, S is a combination of a translation, rotation, dilation and, perhaps, a reflection, called an affine mapping. Unlike similarities, affine mappings contract with differing ratios in different directions.
Theorem 1 Consider the iterated function system given by the contractions{ for each i. Then there is a unique attractor F, i.e. a non-empty compact set such Moreover, if we define a transformation S on the class φ of non-empty compact sets by For E∈φ, and write S k for the kth iterate of S (so for every set E∈φ such that If an IFS consists of affine contractions { m S S , , 1  } on R n , the attractor F guaranteed by Theorem 1 is termed a self-affine set.
Since self-affine time series have a power-law dependence of the power-spectral density function on frequency, .self-affine time series exhibit long-range persistence. For a practical data, one can use the relationship of power spectrum and frequency to determine if the data has the self-affine fractal characteristic.

Spectrum analysis[65, 66]
Let a time series be . Its spectrum is given by where f is frequency. The power spectrum of f is defined by If the power spectrum obeys a power law for large f, the time series x(t) has the self-affine fractal characteristic. The self-affine fractal dimension D is given S(f) is plotted as a function of f with log-log scaling. β is the negative of the slope of the bestfit straight line in the range of large f. Note that the value of β is a measure of the strength of persistence in a time series. β>1 reflects strong persistence and nonstationary. 1>β>0 describes weak persistence and stationary. β=0 shows uncorrelated stationary. β<0 indicates antipersistence and stationary. In all cases, however, a self-affine time series with a non-zero β has long-range (as well as short-range) persistence and anti-persistence. For small β, the correlations with large lag are small but are non-zero. This can be contrasted with time series that are not self-affine; these may have only short-range persistence (either strong or weak).
Although the self-affine mapping are varied in a continuous way, the dimension of the selfaffine set need not change continuously. Unfortunately, the self-affine fractal situation is much more complicated. It is quite difficult to obtain a general formula for the dimension of self-affine sets. It is not enough that only one fractal dimension is used to describe the selfaffine fractal time series. The multifractal dimensions have been proposed to describe this kind of the time series [67][68][69][70][71][72].

Multifractal dimension
For a measured time series of a multifractal system, its trajectory in phase space is often attracted to a bounded fractal object called strange attractor for which a whole set of dimension Dq has been introduced which generalize the concept of the Hausdorff dimension. Let X1, ..., Xn be a point of the attractor in the phase space. The probability that the trajectory point is found within a ball of radius l around one of the inhomogeneously distributed points of the trajectory is denoted by The q-order correlation integral is defined by The multifractal dimension Dq can be computed by the following equation: The above Dq is the multifractal dimension method based on Grassberger and Procaccia. The generalized correlation integral ) (l C q which can be obtained from an experimental time series yields in a plot ) ( ln l C q vs lnl straight lines with slopes Dq. For q=0, the D0 is called the topological dimension, fractal dimension or capacity dimension. For q=1, the D1 is called the information dimension. For q=2, the D2 is called correlation dimension. The function Dq is monotonically decreasing with q and gives information about the inhomogeneity of the attractor. For simple fractals, called monofractals, such as a homogeneous attractor, the multifractal dimension Dq is constant. In the general case of multifractal objects, the values of Dq monotonically decrease as q increase [67]. The shape of the Dq can be considered a criterion confirming that the object is a nonuniform fractal. Furthermore, it can be determined if the object is a nonlinear, complex structure by using the multifractal dimension of the signal.

Analysis of surface EMG signal based on multifractal dimension
The surface EMG signal is a complicated physiological signal. Its distribution is clearly uneven (see Figure 30). When the surface EMG signal is studied by using the fractal method, one should first determine if the surface EMG signal is fractal. Then, its corresponding fractal dimension D can be estimated by Eq.(77) under a certain resolution. Figure 30 shows the self-affine fractal analysis of the surface EMG signals from Channel 1 during finger flexion, finger tension, forearm pronation and forearm supination (the results of Channel 2 are similar to those of Channel 1). It can be seen that the surface EMG signals have selfaffine fractal characteristics. The results explain the physiological mechanism of the surface EMG signals.
In view of self-affine fractal characteristics, only one single fractal dimension is not easy to characterize the dynamics of surface EMG signals for different actions (see Table 4). There is little difference for the self-affine fractal dimensions of the four actions, where each type of action signals was chosen 100 sets of the data. The data length is 1000 points. In other words, it is difficult to identify the surface EMG signals of the different actions by using a single fractal dimension. The multifractal dimension values should be used to describe the action surface EMG signals during the arm movements.   Here, we use the above multifractal dimension theory to analyze the action surface EMG signals. For the surface EMG signals of the four actions in channel 1, the multifractal analysis results are shown in Figure 31. The results of channel 2 are omitted since they are similar to those of channel 1. In the figure, the Dq-q curves are calculated under q =8, 7, 6, 5, 4, 3, 2, 1, 0, -1, -2, -3, -4, -5, -6, -7, -8. It can be seen that the Dq-q curves have a certain range.
The results indicate that the surface EMG signals are non-uniform fractal structure signals. These are consistent with the results of the above self-affine fractal analysis. The parameter values with q can be used to classify the data. In theory, it will be more reasonable that multifractal dimensions are used to describe the surface EMG signals. However, the actual calculation process of the multifractal dimensions is very time-consuming. For the surface EMG signals, it is extremely difficult to meet the requirements of real-time classification.
a. The multi-fractal curves of EMG signal during finger flexion; abscissa is ln(l), ordinate is ln(Cq(l)) b. The multi-fractal dimensions of curves in Fig. a; abscissa is q, ordinate is Dq c. The multi-fractal curves of EMG signal during finger tension; abscissa is ln(l), ordinate is ln(Cq(l)) d. The multi-fractal dimensions of curves in Fig. c; abscissa is q, ordinate is Dq e. The multi-fractal curves of EMG signal during forearm pronation; abscissa is ln(l), ordinate is ln(Cq(l)) f. The multi-fractal dimensions of curves in Fig. e; abscissa is q, ordinate is Dq g. The multi-fractal curves of EMG signal during forearm pronation; abscissa is ln(l), ordinate is ln(Cq(l)) h. The multi-fractal dimensions of curves in Fig. e; abscissa is q, ordinate is Dq Figure 31. The multi-fractal analysis of surface EMG signals during movements

Conclusion and future research
In order to investigate whether the essence of the surface EMG signal is stochastic or deterministic nonlinear (even chaotic), some emerging nonlinear time series analysis approaches are discussed in this chapter. These techniques are based on detecting and describing determinisitic structure in the signal, such as surrogate data method, VWK model method, chaotic analysis method, symplectic geometry method, fractal analysis method, and so on.
The surrogate data method and VWK model mehtod are used to detect the surface EMG signal for arm movement and muscle fatigue. The results show that the surface EMG signal has deterministic nonlinear components. Moreover, our algorithm of surrogate data based on the null hypothesis 3 is proved that can completely satisfy the requirment of the null hypothesis 3. The VWK method with surrogate data can illuminate that not only the action but also fatigue surface EMG signals contain nonlinear dynamic properties.
Chaotic analysis techniques are reviewed and applied to investigate the surface EMG signals. The results show that the surface EMG signals have high-dimension chaotic dynamics by using correlation dimension and largest Lyapunov expoent techniques. For the estimation of embedding dimension, symplectic principal component analysis method is introduced and discussed. In comparison with correlation dimension algorithm and SVD analysis, symplectic geometry analysis is both very simple and reliable. The results show that symplectic geometry method is useful for determining of the system attractor from the experiment data.
The fractal theory is applied to study the fractal feature of the action surface EMG signal collected from forearm of normal person. The results show that he action surface EMG signal possesses the self-affine fractal characteristic. So, it is difficult to describe the surface EMG signals by using a single fractal dimension. The multifractal dimensions are used to analyze the action surface EMG signals during the arm movements. The results indicate that the surface EMG signals are non-uniform fractal structure signals. The multifractal dimension values can be used to identify the surface EMG signals for different movements.
For the nonlinear characteristics of the surface EMG signals, chaos and fractal theories will play the leading role in the nonlinear study of the surface EMG signals. The related methods need to be further researched and developed although these techniques have been applied to analyze the surface EMG signals. It provides a new way for the study of the quantitative analysis of physiology and pathology, sports medicine, clinical medical diagnostics and bionics of robot limb motion.   Using H (2) , the second column of A (2) can be changed to all zero vector except the first and second elements, namely:

Apendix
Householder matrix H can be obtained by repeating above mentioned method until A (n) becomes an upper triangle matrix: