Open access peer-reviewed chapter

Spatial-Temporal Data Analysis in Nonlinear System

Written By

Xing He and Minyu Chen

Submitted: 31 May 2022 Reviewed: 05 June 2022 Published: 09 July 2022

DOI: 10.5772/intechopen.105709

From the Edited Volume

Nonlinear Systems - Recent Developments and Advances

Edited by Bo Yang and Dušan Stipanović


Abstract

Spatial-temporal analysis is at the heart of data mining in the big data era. Most classical mathematical tools are ill-suited to spatial-temporal data, which has greatly spurred the development of data science, especially big data analytics (BDA). This chapter proposes random matrix theory (RMT) to handle this problem. It begins by modeling spatial-temporal datasets as sequences in which each term is a random matrix. Then, some fundamental RMT principles are briefly discussed, such as asymptotic spectrum laws, transforms, convergence rate, and free probability, in order to extract high-dimensional statistics from the random matrices as indicators. The statistical properties of these indicators are discussed for a better understanding of the system. Finally, some potential application fields are given.

Keywords

  • spatial-temporal data
  • electric power system
  • data-driven
  • random matrix theory
  • situation awareness
  • big data analyzation

1. Introduction

The reliability and intelligent management of electric power systems are critical to daily life. Engineers and academics have recently focused on the large-scale use of phasor measurement units (PMUs) to improve wide-area monitoring, protection, and control [1, 2, 3, 4, 5].

Most existing algorithms for power grids are model-based: they are built upon mechanism assumptions/simplifications and linear system control theory, and they yield a deterministic and typically analytic outcome. These models, however, are ineffective for today’s power system, which is of ever-increasing complexity and uncertainty [6, 7, 8, 9, 10]: 1) interconnection of nearby utilities may improve overall safety and efficiency, but it results in a huge interconnected system, such as the North American Power Grid, which serves almost 400 million customers throughout the continent [11]; 2) the continuous penetration of cell units (e.g., distributed generation) that are small in size, large in number, deployed in a distributed manner, diverse in behavior, smart in response, and uncertain in control [12]; 3) the physical disciplines (mechanics, magnetism, electricity, and electronics) of a system are closely intertwined, especially in a CHP (combined heat and power) system or an IES (integrated energy system) [13]; and 4) the construction of the energy foundation for a smart city. These characteristics lead toward an open, flat, nonlinear, highly uncertain, and distributed energy Internet of Things (EIoT), as shown in Figure 1 [14]. For such an EIoT, a precise mechanistic model or even a proper descriptive representation can hardly be formulated, let alone a model-based linear one.

Figure 1.

Diagram of the future energy Internet of Things: its resource flow, data flow, and participants [14].

Furthermore, engineering data, such as sampling data in a power system, differ from image data. Various sensors, such as phasor measurement units (PMUs), are used to sample data from the grid. The resulting dataset is a time series in a high-dimensional vector space: the temporal variations (T sampling instants) and spatial fluctuations (N grid nodes) are recorded concurrently, and hence it is called spatial-temporal data.

Most mathematical tools are ill-suited to this task [15]. Facing such spatial-temporal data, we can hardly extract statistical information, particularly spatial-temporal correlations, because the high-dimensional structure does not match the requirements of most traditional mathematical methods. The task is also a poor fit for supervised training algorithms such as neural networks, because massive labeled data are either lacking or imbalanced [16].

Fortunately, random matrix theory (RMT), by unifying time and space through their ratio c = N/T, can deal with such data in a mathematically rigorous way. Moreover, linear eigenvalue statistics (LESs) built from data matrices follow Gaussian distributions under very general conditions, and further LES-derived statistical variables can be studied thanks to recent breakthroughs in probability on the central limit theorems for those LESs.
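The modeling step described above, an N-node by T-sample record treated as one matrix in a sequence, can be sketched as follows. This is a minimal illustration under stated assumptions: the stream is synthetic Gaussian noise standing in for PMU measurements, the sizes and the helper names `standardize_rows` and `sliding_windows` are hypothetical, and the row standardization is a common preprocessing choice rather than a step taken from this chapter.

```python
import numpy as np

def standardize_rows(X):
    """Scale each row (one grid node) to zero mean and unit variance."""
    mu = X.mean(axis=1, keepdims=True)
    sigma = X.std(axis=1, keepdims=True)
    # Guard against constant rows, which would otherwise divide by zero.
    return (X - mu) / np.where(sigma > 0, sigma, 1.0)

def sliding_windows(stream, T, step=1):
    """Yield standardized N x T windows from an N x T_total measurement stream."""
    for end in range(T, stream.shape[1] + 1, step):
        yield standardize_rows(stream[:, end - T:end])

rng = np.random.default_rng(0)
N, T_total, T = 20, 200, 60                 # hypothetical sizes
stream = rng.standard_normal((N, T_total))  # synthetic stand-in for sensor data
windows = list(sliding_windows(stream, T, step=10))
print(len(windows), windows[0].shape)       # 15 (20, 60)
```

Each element of `windows` is one term of the matrix sequence; the ratio N/T of a window is the quantity that RMT uses to tie the spatial and temporal dimensions together.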


2. Spatial-temporal data analysis mode, tools, theory, and applications in electric power system

2.1 Big data era, fourth paradigm, and data-driven model

Science itself has changed, as seen in Figure 2 [17]. Initially, there was only experimental science, followed by theoretical science, which included Newton’s Laws, Maxwell’s equations, and so on. For many problems the theoretical models became too hard to solve analytically, and people had to start simulating. These simulations have carried us through much of the previous millennium. Nowadays, people collect data through dense sensor deployments or simulations. The data flood has an impact on experimental, theoretical, and computational science, and several state-of-the-art data technologies and data sciences have converged to provide tremendous promise for data-intensive scientific discovery, the so-called Fourth Paradigm.

Figure 2.

Science paradigms and fourth paradigm [17].

Data-driven analysis has become a natural and pressing topic in energy systems, as evidenced by the IEEE Transactions on Smart Grid special issue “Big data analytics for grid modernization” published in 2016 [18]. Data-driven approaches are also characterized as model-free: we no longer rely heavily on physical models, and hence can handle cases in which physical parameters are incorrect or even totally unavailable. The data-driven mode enables a quick start to our task, especially for a modern energy system in which the behaviors and disciplines of system cell units are strongly coupled.

2.2 Basics of spatial-temporal data and high-dimensional information

Spatial–temporal data analysis means that we simultaneously deal with a large number of variables (in N-dimensional spatial space), and each variable (i = 1, …, N) samples a time series for a duration (in T-dimensional temporal space). Classical statistical theory treats fixed N only (typically N < 6 [3]), e.g., N = 3 for the ABC-dq0 transformation. This fixed small N is called the low-dimensional regime. In practice, we are interested in the case in which N can vary arbitrarily in size compared with T (typically T > 60, N > 20, and c = N/T > 0 [15]). This fundamental difference is the primary motivation for studying BDA.

Spatial–temporal data mining is expected to contribute some (high-dimensional) information with domain-specific meaning attached as the supplement to DT-based situation awareness (SA). High-dimensional indicators (outputs of high-dimensional statistics) and deep features (outputs of deep learning) are two main types of representation of high-dimensional information.

2.3 Spatial-temporal data utilization architecture and tools

Most mathematical methods struggle to extract information from spatial-temporal data. This difficulty has accelerated the development of data science, particularly in the fields of AI and BDA. We describe one widely studied technique for each field: 1) deep learning (DL), which is good at massive data modeling in AI [19], and 2) high-dimensional statistics, or more precisely RMT, which does well in data analytics in BDA. Both tools use a set of (high-dimensional) methodologies for integrated spatial-temporal modeling and analysis, and they have already made profound impacts on many domains. Figure 3 depicts the architecture of big data mining based on DL and RMT.

Figure 3.

Architecture of spatial-temporal data utilization.

2.3.1 Deep learning and its advantages

DL is a cutting-edge data mining approach. As illustrated in Figure 4, deep features are learned level by level from the hidden features in the layer hierarchy [20]. DL uses massive data, without handcrafted feature design, to build a deep (nonlinear) network model. A typical ANN (artificial neural network) is modeled as

Figure 4.

A typical ANN structure.

$$ y = f(x) = f_L\big(W_L \cdots f_2\big(W_2\, f_1(W_1 x + b_1) + b_2\big) \cdots + b_L\big) \tag{1} $$

The above DL network model can be built with little prior knowledge of the physical mechanism or causal relationship. As a result, DL may be used in a variety of situations or even systems without major changes. For example, we use CNNs (convolutional neural networks) for computer vision system modeling [21], LSTM (long short-term memory) networks for prediction [22], and deep reinforcement learning for strategy optimization [23].

DL holds a competitive advantage for feasible data utilization in a complex system. In addition, the performance of a DL model on a generalization task can be quantitatively evaluated by the test error, which helps establish its usefulness in a real-world situation.
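The layered composition in Eq. (1) can be sketched as a short forward pass. This is only an illustration: the layer sizes, the random weights, and the choice of tanh for every activation f_l are assumptions made here, not values from the chapter, and no training is performed.

```python
import numpy as np

def forward(x, weights, biases, activation=np.tanh):
    """Evaluate y = f_L(W_L ... f_1(W_1 x + b_1) ... + b_L) layer by layer."""
    h = x
    for W, b in zip(weights, biases):
        h = activation(W @ h + b)
    return h

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 2]  # hypothetical layer widths: input, two hidden layers, output
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n_out) for n_out in sizes[1:]]

y = forward(rng.standard_normal(sizes[0]), weights, biases)
print(y.shape)  # (2,)
```

In practice the weights W_l and biases b_l would be fitted to data by a DL framework; the loop above only shows how the nested expression is evaluated.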

2.3.2 Big data analytics and RMT and the advantages

BDA uses spatial-temporal joint analysis to acquire high-dimensional statistics. Matrix-based variables, such as the eigenvalues or the matrix variate itself, are likely to provide some insight to BDA [24]. These matrix-based variables are variables of the N × T (large-dimensional) spatial-temporal data matrix that have an intrinsic statistical link, whether causal or not. They are analytically intractable because of their high dimensionality rather than their sheer size. RMT is designed for exactly this setting.

RMT treats the joint eigenvalue distribution as the object of statistical analysis in the asymptotic regime. In particular, by unifying time and space through their ratio c = N/T, BDA quantities are obtained as functionals of the eigenvalue distributions. For example, the matrix’s LES indicators [25] follow Gaussian distributions under very general conditions. Furthermore, many LES-derived variables, whose statistical properties are mostly derivable and provable, can be studied thanks to the latest breakthroughs in high-dimensional probability [15]. In this sense, RMT is rigorous and fundamental in nature. Besides, RMT performs well with only moderate-size (unlabeled) data, which is often the situation for a domain-specific problem in EIoT.

2.4 Random matrix theory in a nutshell

2.4.1 RMT and its universality principle

Two ensembles, Gaussian unitary ensemble (GUE) and Laguerre unitary ensemble (LUE), are studied first in RMT [10]:

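$$ \Gamma = \begin{cases} \frac{1}{2}\left(R + R^H\right), & R \in \mathbb{C}^{N \times N}, & \text{GUE}; \\ \frac{1}{T} R R^H, & R \in \mathbb{C}^{N \times T}, & \text{LUE}. \end{cases} \tag{2} $$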

where R is a random matrix whose entries are i.i.d. standard Gaussian.
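A minimal sketch of how the two ensembles in Eq. (2) might be sampled numerically is given below. The helper names are hypothetical, and two conventions are assumptions made here rather than statements from the chapter: the entries are taken as complex Gaussian with unit variance, and the GUE eigenvalues are divided by sqrt(N/2) so that they can be compared with a semicircle supported on [-2, 2].

```python
import numpy as np

def complex_gaussian(shape, rng):
    """i.i.d. complex Gaussian entries with zero mean and unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def gue(N, rng):
    """Hermitian matrix (R + R^H) / 2 with R an N x N complex Gaussian matrix."""
    R = complex_gaussian((N, N), rng)
    return (R + R.conj().T) / 2

def lue(N, T, rng):
    """Sample covariance-type matrix R R^H / T with R an N x T complex Gaussian matrix."""
    R = complex_gaussian((N, T), rng)
    return R @ R.conj().T / T

rng = np.random.default_rng(1)
N, T = 100, 400
G, L = gue(N, rng), lue(N, T, rng)

# Assumed rescaling: off-diagonal entries of G have variance 1/2, so the raw
# spectrum spreads over roughly [-sqrt(2N), sqrt(2N)].
eig_G = np.linalg.eigvalsh(G) / np.sqrt(N / 2)
eig_L = np.linalg.eigvalsh(L)
print(eig_G.min(), eig_G.max())
print(eig_L.min(), eig_L.max())
```

With N/T = 0.25, the eigenvalues of the LUE sample should cluster near the interval [0.25, 2.25], up to finite-size fluctuations at the edges.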

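We investigate the rate of convergence of the expected empirical spectral distribution (ESD) of Γ. Let h_Γ(x) denote the true eigenvalue density. Wigner’s semicircle law and the Marchenko-Pastur (M-P) law state, for the GUE and the LUE respectively, that

$$ h_\Gamma(x) = \begin{cases} \frac{1}{2\pi}\sqrt{4 - x^2}, & x \in [-2, 2], & \text{GUE}; \\ \frac{1}{2\pi c x}\sqrt{(x - a_1)(a_2 - x)}, & x \in [a_1, a_2], & \text{LUE}, \end{cases} \tag{3} $$

where a1 = (1 − √c)² and a2 = (1 + √c)².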

The universality principle enables us to perform hypothesis tests when the matrix entries are not Gaussian distributed, while using the same test statistics as in the Gaussian case. Numerous studies using field data [25, 26] demonstrate that the M-P law remains valid at moderate matrix sizes, on the order of tens. This is the very reason why RMT is widely used in engineering.
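The universality claim can be probed numerically with a small experiment, sketched here under stated assumptions: the entries are Rademacher (±1) variables rather than Gaussian, c is taken as N/T, and the function name `mp_density` and the chosen sizes are illustrative only. The sketch compares the eigenvalues of the sample covariance matrix with the LUE branch of Eq. (3).

```python
import numpy as np

def mp_density(x, c):
    """M-P density from Eq. (3), assuming c = N/T < 1 and unit-variance entries."""
    a1, a2 = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
    x = np.atleast_1d(np.asarray(x, dtype=float))
    rho = np.zeros_like(x)
    inside = (x > a1) & (x < a2)
    rho[inside] = np.sqrt((x[inside] - a1) * (a2 - x[inside])) / (2 * np.pi * c * x[inside])
    return rho

rng = np.random.default_rng(2)
N, T = 100, 400
c = N / T
X = rng.choice([-1.0, 1.0], size=(N, T))  # non-Gaussian entries, zero mean, unit variance
eigs = np.linalg.eigvalsh(X @ X.T / T)

a1, a2 = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
grid = np.linspace(a1, a2, 20001)
dx = grid[1] - grid[0]
mass = np.sum(mp_density(grid, c)) * dx  # should be close to 1

# Fraction of eigenvalues below 1 versus the M-P probability mass below 1.
emp_below = np.mean(eigs < 1.0)
theo_below = np.sum(mp_density(grid[grid < 1.0], c)) * dx
print(mass, emp_below, theo_below)
```

If universality holds at this size, the empirical fraction and the theoretical mass should be close even though no entry of X is Gaussian.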

2.4.2 Linear eigenvalue statistics and its properties

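Consider a random matrix Γ ∈ R^(N × T), and let M = (1/T) Γ Γ^H be its covariance matrix, with eigenvalues λ_1, …, λ_N. For a test function φ, the LES τ of Γ is defined as [27]

$$ \tau_\varphi = \sum_{i=1}^{N} \varphi(\lambda_i) = \mathrm{Tr}\,\varphi(M). \tag{4} $$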

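The law of large numbers tells us that τ_φ/N converges in probability to the limit

$$ \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \varphi(\lambda_i) = \int \varphi(\lambda)\, \rho(\lambda)\, d\lambda \tag{5} $$

where ρ(λ) is the probability density function of the eigenvalues, which is given in Eq. (3). Therefore, for large N, we deduce that

$$ \tau_\varphi = \sum_{i=1}^{N} \varphi(\lambda_i) = \mathrm{Tr}\,\varphi(M) = N \int \varphi(\lambda)\, \rho(\lambda)\, d\lambda \tag{6} $$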
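As a rough numerical check of Eq. (6), the sketch below computes an LES from a simulated matrix and compares it with N times the integral of φ against the M-P density. The test function φ(λ) = λ², the matrix sizes, the convention c = N/T < 1, and the helper names are all assumptions chosen for illustration; for this φ the integral equals 1 + c.

```python
import numpy as np

def les(X, phi):
    """tau_phi = sum_i phi(lambda_i), with lambda_i the eigenvalues of X X^H / T."""
    T = X.shape[1]
    lam = np.linalg.eigvalsh(X @ X.conj().T / T)
    return float(np.sum(phi(lam)))

def les_expectation(phi, N, T, n_grid=20001):
    """N * integral of phi(lambda) * rho(lambda), rho being the M-P density (c = N/T < 1)."""
    c = N / T
    a1, a2 = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
    lam = np.linspace(a1, a2, n_grid)[1:-1]  # interior points; the density is 0 at the edges
    rho = np.sqrt((lam - a1) * (a2 - lam)) / (2 * np.pi * c * lam)
    return float(N * np.sum(phi(lam) * rho) * (lam[1] - lam[0]))

def phi(lam):
    return lam ** 2  # illustrative test function

rng = np.random.default_rng(3)
N, T = 100, 400
X = rng.standard_normal((N, T))

tau = les(X, phi)
tau_theory = les_expectation(phi, N, T)
print(tau, tau_theory)
```

The two numbers should agree up to a fluctuation of order one, which is small compared with N; that fluctuation is what the central limit theorem below quantifies.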

The Central Limit Theorem (CLT) for LES is studied as the natural second step:

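$$ \sigma^2(\tau_\varphi) = \frac{2}{c\pi^2} \iint_{-\frac{\pi}{2} < \theta_1, \theta_2 < \frac{\pi}{2}} \psi^2(\theta_1, \theta_2)\,(1 - \sin\theta_1 \sin\theta_2)\, d\theta_1\, d\theta_2 + \frac{\kappa_4}{\pi^2} \left( \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \varphi(\zeta(\theta)) \sin\theta \, d\theta \right)^2 \tag{7} $$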

See [25] for details.
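The Gaussian behavior of an LES can be examined empirically without evaluating the variance formula in closed form. The following Monte Carlo sketch makes several assumptions of its own: real Gaussian entries, φ(λ) = λ², modest sizes N = 40 and T = 100, and 500 trials. It estimates the mean and standard deviation of τ from the samples and reports how many standardized samples fall within ±1.96.

```python
import numpy as np

def les(X, phi):
    """tau_phi = sum_i phi(lambda_i), with lambda_i the eigenvalues of X X^T / T."""
    T = X.shape[1]
    lam = np.linalg.eigvalsh(X @ X.T / T)
    return float(np.sum(phi(lam)))

def phi(lam):
    return lam ** 2  # illustrative test function

rng = np.random.default_rng(4)
N, T, trials = 40, 100, 500
samples = np.array([les(rng.standard_normal((N, T)), phi) for _ in range(trials)])

mean, std = samples.mean(), samples.std(ddof=1)
z = (samples - mean) / std
coverage = np.mean(np.abs(z) < 1.96)  # near 0.95 if tau is approximately Gaussian
print(mean / N, std, coverage)
```

A coverage close to 0.95 at these moderate sizes is consistent with the CLT; the sample standard deviation could also be compared with the value predicted by Eq. (7) once ψ, ζ, and κ4 are taken from [25].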

2.4.3 LES-based hypothesis testing for random matrix

The LES τ, a scalar random variable defined in Eq. (4), is studied instead of the probability distribution of eigenvalues in Eq. (3). It can be viewed as a mathematically rigorous dimensionality reduction: the N × T random matrix is reduced to a scalar random variable.

As N → ∞, the asymptotic limits of the expectation and variance of the LES τ, i.e., E(τ) and σ2 (τ), are given in Eqs. (6) and (7), respectively. These two equations are sufficient to study the scalar random variable τ. The universality principle, as well as engineering experience, indicates that moderate values of N and T are accurate enough for practical purposes. The LES τ is robust against data flaws and insensitive to noise [10]. All of these statistical properties make the LES a good SA indicator.

2.5 High-dimensional situation awareness indicator system and its properties

According to Eq. (4), numerous LESs can be designed from a given spatial-temporal data matrix Γ by varying the test function. Similarly, other high-dimensional indicators, for instance, statistical indicators, deep features, and electrical features, can be calculated as the outputs of the data tools in Figure 3. They are tied together to provide insight into domain-specific SA criteria for detection, prediction, etc. The details of the high-dimensional SA indicator system and its successful application cases can be found in ref. [15].

With these indicators, the high-dimensional indicator system is built; it supplies multiple perspectives from which to gain insight into the system. To serve a domain-specific SA task, the test function ϕ acts as a flexible filter that is chosen according to the task. Table 1 lists the properties of LES indicators and compares them with classical ones.
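To make the idea of designing several LESs from one data matrix concrete, the sketch below evaluates a few test functions on the same eigenvalues. The three functions (a second-order Chebyshev polynomial, a log-determinant, and a likelihood-ratio-type function) are illustrative choices made here and are not prescribed by this chapter; the dictionary name and the helper `les_indicators` are hypothetical.

```python
import numpy as np

# Hypothetical catalogue of test functions; each one defines a different LES.
TEST_FUNCTIONS = {
    "chebyshev_T2": lambda lam: 2 * lam ** 2 - 1,
    "log_det": lambda lam: np.log(lam),
    "likelihood_ratio": lambda lam: lam - np.log(lam) - 1,
}

def les_indicators(X, functions=TEST_FUNCTIONS):
    """Return one LES per test function for the matrix M = X X^H / T."""
    T = X.shape[1]
    lam = np.linalg.eigvalsh(X @ X.conj().T / T)
    return {name: float(np.sum(phi(lam))) for name, phi in functions.items()}

rng = np.random.default_rng(5)
N, T = 20, 80  # N <= T keeps M full rank almost surely, so log(lambda) is finite
indicators = les_indicators(rng.standard_normal((N, T)))
print(indicators)

# Sanity case: a matrix with M equal to the identity, so every eigenvalue is 1.
X_id = np.zeros((4, 16))
X_id[:, :4] = np.sqrt(16) * np.eye(4)
identity_case = les_indicators(X_id)
```

Because every indicator reuses the same eigenvalue decomposition, adding another test function costs almost nothing, which is one practical sense in which φ acts as a task-dependent filter.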

| High-dimensional (LES) indicators | Classical indicators |
| --- | --- |
| Data-driven | (Mechanism) model based |
| Supported by data science | Supported by physical laws or experience |
| Maybe unclearly defined in engineering | Clearly defined |
| Often a probabilistic value | Often a deterministic value |
| Often in high dimensions | Often in low dimensions |
| Able to harness the spatial-temporal data flexibly | Only a few data are available |
| Robust against bad data and insensitive to data selections | Susceptible to data selections (usually a single measurement at a time slice) |
| Pure statistical procedure | System errors are inevitable |
| Naturally coupling/decoupling for data block | Coupling/decoupling based on assumptions and simplifications |
| Random errors can be estimated with the model size (N, T) | Error accumulation is inevitable and difficult to evaluate |

Table 1.

High-dimensional indicator system for EIoT.

Table 1 suggests that LES provides a better-suited indicator system for the Fourth Paradigm. The relation of the LES indicators to the classical ones is, in some sense, like that of quantum physics to classical physics. By comparing experimental values with ideal theoretical values, LES conducts SA in a complex system statistically.

In short, RMT supplies us with a data-driven approach to indicator extraction for the informatization of a real system via sampled spatial-temporal data. A cluster of statistical indicators, obtained via a mathematical procedure, forms a new way of understanding the system. Some advantages have already been shown in our previous work [10, 17]: a data-driven and model-free mode, theoretical guidance, speed, reasonableness, sensitivity, flexibility, and robustness against bad data.

2.6 LES-based hypothesis testing for random matrix

To study the convergence as a function of N, we study the LES instead of the probability distribution of eigenvalues in Eq. (3). For an arbitrary test function with enough smoothness, the LES τ (viewed as a random variable Y) is a scalar random variable defined in Eq. (4). As N → ∞, the asymptotic limit of its expectation, E(Y), is given in Eq. (6), and the asymptotic limit of its variance, σ2 (Y), is given in Eq. (7). These two equations are enough to study the scalar random variable Y. This approach can be viewed as a dimensionality reduction: the random data matrix of size N × T is reduced to a scalar random variable Y. The reduction is mathematically rigorous only when N, T → ∞ with N/T → c. Experience demonstrates, however, that moderate values of N and T are accurate enough for practical purposes. Moreover, our previous work shows that the LES is robust against data errors (e.g., data loss, data out-of-synchronization) and insensitive to (independent) random noise (not limited to white noise), which is not true of low-dimensional statistics such as the mean and variance of any single variable. All these statistical properties make the LES a good matrix-based variable for designing a hypothesis test aimed at anomaly detection.

We formulate the hypothesis testing in terms of the statistical properties of LES. Referring to the Gaussian property and standard scores, the detection is modeled as a binary hypothesis testing: the normal hypothesis H0 (no anomaly present) and the abnormal one H1, denoted by:

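$$ \mathcal{H}_0: \left| \frac{\tau_\varphi - E(\tau_\varphi)}{\sigma(\tau_\varphi)} \right| < \epsilon, \qquad \mathcal{H}_1: \left| \frac{\tau_\varphi - E(\tau_\varphi)}{\sigma(\tau_\varphi)} \right| \ge \epsilon, \tag{8} $$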

where ϵ is a threshold value that needs to be preset—e.g., at a significance level of 0.05, the ϵ should be set at 1.96.
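The test in Eq. (8) can be sketched end to end as follows. Several choices here are assumptions for illustration rather than the authors' procedure: E(τ) and σ(τ) under H0 are estimated by Monte Carlo from Gaussian surrogate matrices (instead of being evaluated from Eqs. (6) and (7)), rows are standardized before the eigenvalues are computed, φ(λ) = λ², and the anomaly is a synthetic common-mode signal added to every node.

```python
import numpy as np

def standardize_rows(X):
    """Scale each row to zero mean and unit variance."""
    mu = X.mean(axis=1, keepdims=True)
    sigma = X.std(axis=1, keepdims=True)
    return (X - mu) / np.where(sigma > 0, sigma, 1.0)

def les(X, phi):
    """LES of the row-standardized matrix: sum of phi over eigenvalues of Z Z^T / T."""
    Z = standardize_rows(X)
    lam = np.linalg.eigvalsh(Z @ Z.T / Z.shape[1])
    return float(np.sum(phi(lam)))

def calibrate(N, T, phi, rng, trials=300):
    """Estimate E(tau) and sigma(tau) under H0 from Gaussian surrogate matrices."""
    samples = [les(rng.standard_normal((N, T)), phi) for _ in range(trials)]
    return float(np.mean(samples)), float(np.std(samples, ddof=1))

def detect(X, phi, mean0, std0, eps=1.96):
    """Return the standard score of tau and whether H1 (anomaly) is declared."""
    z = (les(X, phi) - mean0) / std0
    return z, bool(abs(z) >= eps)

def phi(lam):
    return lam ** 2  # illustrative test function

rng = np.random.default_rng(6)
N, T = 40, 100
mean0, std0 = calibrate(N, T, phi, rng)

normal = rng.standard_normal((N, T))
common = rng.standard_normal(T)  # synthetic disturbance shared by all nodes
anomalous = normal + common      # broadcasting adds it to every row

z_normal, flag_normal = detect(normal, phi, mean0, std0)
z_anom, flag_anom = detect(anomalous, phi, mean0, std0)
print(z_normal, flag_normal)
print(z_anom, flag_anom)
```

The shared disturbance correlates the rows and produces one dominant eigenvalue, so the standard score of the anomalous window lies far beyond the threshold, while a window drawn under H0 exceeds it only with probability of about 0.05.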


3. Conclusion

This chapter, motivated by the needs of the future electrical grid, studies nonlinear system analysis based on RMT. Three ingredients are discussed in detail: 1) data modeling: modeling the spatial-temporal data as a sequence of random matrices, which are naturally connected to RMT; 2) data analytics: conducting high-dimensional analysis to obtain the statistical indicators; and 3) interpretation: interpreting the indicators by studying their properties for a better understanding of the system.

The experimental indicators, which are fully derived from the sampling data, are applicable to various engineering functions. For example, by comparing the LESs with their theoretical prediction, anomaly detection can be implemented.

Future research directions include: (1) model validation with different implementations of the grid, ranging from static and dynamic models to real-world systems; (2) data fusion with a number of random data matrices, using mathematical tools such as free probability; and (3) the use of Gaussian random matrices in place of the general data matrices that are obtained from the electrical grid. The universality principle of RMT suggests that this replacement causes negligible errors.


Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant No. 51907121).

References

  1. Chakrabarti S, Kyriakides E, Bi T, Cai D, Terzija V. Measurements get together. IEEE Power and Energy Magazine. 2009;7(1):41-49
  2. Luo L, Bei H, Chen J, Sheng G, Jiang X. Partial discharge detection and recognition in random matrix theory paradigm. IEEE Access. 2016;PP(99):1-1
  3. Lei C, Qiu RC, Xing H, Ling Z, Liu Y. Massive streaming PMU data modeling and analytics in smart grid state evaluation based on multiple high-dimensional covariance tests. IEEE Transactions on Big Data. 2018;4(1):2332-7790
  4. Hou W, Ning Z, Lei G, Xu Z. Temporal, functional and spatial big data computing framework for large-scale smart grid. IEEE Transactions on Emerging Topics in Computing. 2017;PP(99):1-1
  5. Tu C, Xi H, Shuai Z, Fei J. Big data issues in smart grid – A review. Renewable and Sustainable Energy Reviews. 2017;79:1099-1107
  6. Shaker H, Zareipour H, Wood D. A data-driven approach for estimating the power generation of invisible solar sites. IEEE Transactions on Smart Grid. 2016;7(5):2466-2476
  7. Motter AE, Myers SA, Anghel M, Nishikawa T. Spontaneous synchrony in power-grid networks. Nature Physics. 2013;9(3):191-197
  8. Wang L, Li HW, Wu CT. Stability analysis of an integrated offshore wind and seashore wave farm fed to a power grid using a unified power flow controller. IEEE Transactions on Power Systems. 2013;28(3):2211-2221
  9. Xu X, He X, Ai Q, Qiu RC. A correlation analysis method for power systems based on random matrix theory. IEEE Transactions on Smart Grid. 2017;8(4):1811-1820
  10. He X, Ai Q, Qiu C, Huang W, Piao L, Liu H. A big data architecture design for smart grids based on random matrix theory. IEEE Transactions on Smart Grid. 2017;8(2):674-686
  11. Office of Electric Transmission and Distribution. Grid 2030: A National Vision for Electricity’s Second 100 Years. Washington, DC: Office of Electric Transmission and Distribution; 2003
  12. Yang B, Yu T, Shu H, Dong J, Jiang L. Robust sliding-mode control of wind energy conversion systems for optimal power extraction via nonlinear perturbation observers. Applied Energy. 2018;210:711-723
  13. Fu X, Sun H, Guo Q, Pan Z, Xiong W, Wang L. Uncertainty analysis of an integrated energy system based on information theory. Energy. 2017;122:649-662
  14. Guo J. The evolution of power system characteristics and related thinking. In: 2nd “Clean Energy Development and Consumption Symposium”. Xi’an, China: Chinese Society for Electrical Engineering; 2019
  15. Qiu R, Antonik P. Smart Grid and Big Data. New York: John Wiley and Sons; 2015
  16. Cheng L, Yu T. A new generation of AI: A review and perspective on machine learning technologies applied to smart energy and electric power systems. International Journal of Energy Research. 2019;43(6):1928-1973
  17. Hey AJ, Tansley S, Tolle KM, et al. The Fourth Paradigm: Data-Intensive Scientific Discovery. Vol. 1. WA: Microsoft Research Redmond; 2009
  18. Hong T, Chen C, Huang J, Lu N, Xie L, Zareipour H. Guest editorial: Big data analytics for grid modernization. IEEE Transactions on Smart Grid. 2016;7(5):2395-2396
  19. Najafabadi MM, Villanustre F, Khoshgoftaar TM, Seliya N, Wald R, Muharemagic E. Deep learning applications and challenges in big data analytics. Journal of Big Data. 2015;2(1):1
  20. Ren Y, Zhang L, Suganthan PN. Ensemble classification and regression-recent developments, applications and future directions. IEEE Computational Intelligence Magazine. 2016;11(1):41-53
  21. Ling Z, Zhang D, Qiu RC, Jin Z, Zhang Y, He X, et al. An accurate and real-time method of self-blast glass insulator location based on Faster R-CNN and U-Net with aerial images. CSEE Journal of Power and Energy Systems. 2019;5(4):474-482
  22. Kong W, Dong ZY, Jia Y, Hill DJ, Xu Y, Zhang Y. Short-term residential load forecasting based on LSTM recurrent neural network. IEEE Transactions on Smart Grid. 2017;10(1):841-851
  23. Zhang Z, Zhang D, Qiu RC. Deep reinforcement learning for power system applications: An overview. CSEE Journal of Power and Energy Systems. 2019;6(1):213-225
  24. Adhikari S. Matrix variate distributions for probabilistic structural dynamics. AIAA Journal. 2007;45(7):1748-1762
  25. Shcherbina M. Central limit theorem for linear eigenvalue statistics of the Wigner and sample covariance random matrices. Journal of Mathematical Physics, Analysis, Geometry. 2011;7(2):176-192
  26. He X, Qiu RC, Ai Q, Chu L, Xu X, Ling Z. Designing for situation awareness of future power grids: An indicator system based on linear eigenvalue statistics of large random matrices. IEEE Access. 2016;4:3557-3568
  27. Lytova A, Pastur L, et al. Central limit theorem for linear eigenvalue statistics of random matrices with independent entries. The Annals of Probability. 2009;37(5):1778-1840
