
Book chapter from "Linear Algebra - Theorems and Applications", edited by Hassan Abid Yasser, ISBN 978-953-51-0669-2, published July 11, 2012 under a CC BY 3.0 license. © The Author(s).

Chapter 6

Identification of Linear, Discrete-Time Filters via Realization

By Daniel N. Miller and Raymond A. de Callafon
DOI: 10.5772/48293


1. Introduction

The realization of a discrete-time, linear, time-invariant (LTI) filter from its impulse response provides insight into the role of linear algebra in the analysis of both dynamical systems and rational functions. For an LTI filter, a sequence of output data measured over some finite period of time may be expressed as the linear combination of the past input and the input measured over that same period. For a finite-dimensional LTI filter, the mapping from past input to future output is a finite-rank linear operator, and the effect of past input, that is, the memory of the system, may be represented as a finite-dimensional vector. This vector is the state of the system.

The central idea of realization theory is to first identify the mapping from past input to future output and to then factor it into two parts: a map from the input to the state and another from the state to the output. This factorization guarantees that the resulting system representation is both causal and finite-dimensional; thus it can be physically constructed, or realized.

System identification is the science of constructing dynamic models from experimentally measured data. Realization-based identification methods construct models by estimating the mapping from past input to future output based on this measured data. The non-deterministic nature of the estimation process causes this mapping to have an arbitrarily large rank, and so a rank-reduction step is required to factor the mapping into a suitable state-space model. Both these steps must be carefully considered to guarantee unbiased estimates of dynamic systems.

The foundations of realization theory are primarily due to Kalman and first appear in the landmark paper of [1], though the problem is not defined explicitly until [2], which also coins the term "realization" as a state-space model of a linear system constructed from an experimentally measured impulse response. It was [3] that introduced the structured-matrix approach now synonymous with the term "realization theory" by re-interpreting a theorem originally due to [4] in a state-space LTI system framework.

Although Kalman's original definition of "realization" implied an identification problem, it was not until [5] proposed rank-reduction by means of the singular-value decomposition that Ho's method became feasible for use with non-deterministic data sets. The combination of Ho's method and the singular-value decomposition was finally generalized for use with experimentally measured data by Kung in [6].

With the arrival of Kung's method came the birth of what is now known as the field of subspace identification methods. These methods use structured matrices of arbitrary input and output data to estimate a state-sequence from the system. The system is then identified from the propagation of the state over time. While many subspace methods exist, the most popular are the Multivariable Output-Error State Space (MOESP) family, due to [7], and the Numerical Algorithms for Subspace State-Space System Identification (N4SID) family, due to [8]. Related to subspace methods is the Eigensystem Realization Algorithm [9], which applies Kung's algorithm to impulse-response estimates, typically obtained through an Observer/Kalman Filter Identification (OKID) algorithm [10].

This chapter presents the central theory behind realization-based system identification in a chronological context, beginning with Kronecker's theorem, proceeding through the work of Kalman and Kung, and presenting a generalization of the procedure to arbitrary sets of data. This journey provides an interesting perspective on the original role of linear algebra in the analysis of rational functions and highlights the similarities of the different representations of LTI filters. Realization theory is a diverse field that connects many tools of linear algebra, including structured matrices, the QR-decomposition, the singular-value decomposition, and linear least-squares problems.

2. Transfer-function representations

We begin by reviewing some properties of discrete-time linear filters, focusing on the role of infinite series expansions in analyzing the properties of rational functions. The reconstruction of a transfer function from an infinite impulse response is equivalent to the reconstruction of a rational function from its Laurent series expansion. The reconstruction problem is introduced and solved by forming structured matrices of impulse-response coefficients.

2.1. Difference equations and transfer functions

Discrete-time linear filters are most frequently encountered in the form of difference equations that relate an input signal $u_k$ to an output signal $y_k$. A simple example is an output $y_k$ determined by a weighted sum of the inputs from $u_k$ to $u_{k-m}$,

$$y_k = b_0u_k + b_1u_{k-1} + \cdots + b_mu_{k-m}.$$
More commonly, the output $y_k$ also contains a weighted sum of previous outputs, such as a weighted sum of samples from $y_{k-1}$ to $y_{k-n}$,

$$y_k = b_0u_k + b_1u_{k-1} + \cdots + b_mu_{k-m} - a_1y_{k-1} - a_2y_{k-2} - \cdots - a_ny_{k-n}.$$
The impulse response of a filter is the output sequence $g_k = y_k$ generated from the input

$$u_k = \begin{cases} 1, & k = 0, \\ 0, & k \neq 0. \end{cases}$$
The parameters $g_k$ are the impulse-response coefficients, and they completely describe the behavior of an LTI filter through the convolution

$$y_k = \sum_{j=0}^{\infty} g_ju_{k-j}.$$
Filters of type () are called finite-impulse-response (FIR) filters because $g_k$ is a finite-length sequence that settles to 0 once $k > m$. Filters of type () are called infinite-impulse-response (IIR) filters, since generally the impulse response will never completely settle to 0.
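To make these recursions concrete, the following is a minimal sketch in Python (NumPy); the helper name impulse_response, the sign convention (monic denominator), and the example coefficients are our illustrative choices, not the chapter's.

```python
import numpy as np

def impulse_response(b, a, n_samples):
    """Impulse response of y_k = sum_j b[j] u_{k-j} - sum_i a[i] y_{k-i},
    with numerator b = [b_0, ..., b_m] and monic denominator a = [a_1, ..., a_n]."""
    g = np.zeros(n_samples)
    u = np.zeros(n_samples)
    u[0] = 1.0  # unit impulse
    for k in range(n_samples):
        acc = sum(b[j] * u[k - j] for j in range(len(b)) if k - j >= 0)
        acc -= sum(a[i - 1] * g[k - i] for i in range(1, len(a) + 1) if k - i >= 0)
        g[k] = acc
    return g

# Example: G(z) = 1/(z - 0.5) has impulse response 0, 1, 0.5, 0.25, ...
print(impulse_response(b=[0.0, 1.0], a=[-0.5], n_samples=6))
```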

A system is stable if a bounded $u_k$ results in a bounded $y_k$. Because the output of an LTI filter is a linear combination of the input and previous output, any input-output sequence can be formed from a linear superposition of other input-output sequences; hence proving that the system has a bounded output for a single input sequence is necessary and sufficient to prove the stability of an LTI filter. The simplest input to consider is an impulse, and so a suitable definition of system stability is that the absolute sum of the impulse response is bounded,

$$\sum_{k=0}^{\infty} |g_k| < \infty.$$
Though the impulse response completely describes the behavior of an LTI filter, it does so with an infinite number of parameters. For this reason, discrete-time LTI filters are often written as transfer functions of a complex variable z. This enables analysis of filter stability and computation of the filter's frequency response in a finite number of calculations, and it simplifies convolution operations into basic polynomial algebra.

The transfer function is found by grouping output and input terms together and taking the Z-transform of both signals. Let $Y(z) = \sum_{k=-\infty}^{\infty} y_kz^{-k}$ be the Z-transform of $y_k$ and $U(z)$ be the Z-transform of $u_k$. From the time-shift property

$$\mathcal{Z}[y_{k-q}] = z^{-q}Y(z),$$
the relationship between $Y(z)$ and $U(z)$ may be expressed in polynomials of $z$ as

$$Y(z)\left(1 + a_1z^{-1} + \cdots + a_nz^{-n}\right) = U(z)\left(b_0 + b_1z^{-1} + \cdots + b_mz^{-m}\right).$$
The ratio of these two polynomials is the filter's transfer function

$$G(z) = \frac{b(z)}{a(z)} = \frac{b_0z^m + b_1z^{m-1} + \cdots + b_m}{z^n + a_1z^{n-1} + \cdots + a_n}.$$
When $n \ge m$, $G(z)$ is proper. If the transfer function is not proper, then the difference equations will have $y_k$ dependent on future input samples such as $u_{k+1}$. Proper transfer functions are required for causality, and thus all physical systems have proper transfer-function representations. When $n > m$, the system is strictly proper. Filters with strictly proper transfer functions have no feed-through terms; the output $y_k$ does not depend on $u_k$, only on the preceding input $u_{k-1}, u_{k-2}, \ldots$. In this chapter, we assume all systems are causal and all transfer functions proper.

If $a(z)$ and $b(z)$ have no common roots, then the rational function $G(z)$ is coprime, and the order $n$ of $G(z)$ cannot be reduced. Fractional representations are not limited to single-input-single-output systems. For vector-valued input signals $u_k \in \mathbb{R}^{n_u}$ and output signals $y_k \in \mathbb{R}^{n_y}$, an LTI filter may be represented as an $n_y \times n_u$ matrix of rational functions $G_{ij}(z)$, and the system will have matrix-valued impulse-response coefficients. For simplicity, we will assume that transfer-function representations are single-input-single-output, though all results presented here generalize to the multi-input-multi-output case.

2.2. Stability of transfer function representations

Because the effect of $b(z)$ is equivalent to a finite-impulse-response filter, the only requirement for $b(z)$ to produce a stable system is that its coefficients be bounded, which we may safely assume is always the case. Thus the stability of a transfer function $G(z)$ is determined entirely by $a(z)$, or more precisely, by the roots of $a(z)$. To see this, suppose $a(z)$ is factored into its roots, which are the poles $p_i$ of $G(z)$,

$$G(z) = \frac{b(z)}{\prod_{i=1}^{n}(z - p_i)}.$$
To guarantee a bounded $y_k$, it is sufficient to study a single pole, which we will denote simply as $p$. Thus we wish to determine necessary and sufficient conditions for stability of the system

$$G'(z) = \frac{1}{z - p}.$$
Note that $p$ may be complex. Assume that $|z| > |p|$. $G'(z)$ then has the Laurent-series expansion

$$G'(z) = \frac{z^{-1}}{1 - pz^{-1}} = \sum_{k=1}^{\infty} p^{k-1}z^{-k}.$$
From the time-shift property of the z-transform, it is immediately clear that the sequence

$$g_k = \begin{cases} 0, & k \le 0, \\ p^{k-1}, & k \ge 1 \end{cases}$$
is the impulse response of $G'(z)$. If we require that () is absolutely summable and let $|z| = 1$, the result is the original stability requirement (), which may be written in terms of $p$ as

$$\sum_{k=1}^{\infty} \left|p^{k-1}\right| < \infty.$$
This is true if and only if $|p| < 1$, and thus $G'(z)$ is stable if and only if $|p| < 1$. Finally, from () we may deduce that a system is stable if and only if all the poles of $G(z)$ satisfy the property $|p_i| < 1$.

2.3. Construction of transfer functions from impulse responses

Transfer functions are a convenient way of representing complex system dynamics in a finite number of parameters, but the coefficients of a(z) and b(z) cannot be measured directly. The impulse response of a system can be found experimentally by either direct measurement or from other means such as taking the inverse Fourier transform of a measured frequency response [11]. It cannot, however, be represented in a finite number of parameters. Thus the conversion between transfer functions and impulse responses is an extremely useful tool.

For a single-pole system such as (), the expansion () provides an obvious means of reconstructing a transfer function from a measured impulse response: given any two sequential impulse-response coefficients $g_k$ and $g_{k+1}$, the pole of $G'(z)$ may be found from

$$p = \frac{g_{k+1}}{g_k}.$$

Notice that this is true for any $k$, and the impulse response can be said to have a shift-invariant property in this respect.
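A two-line numerical check of this shift-invariance (a sketch with an arbitrarily chosen pole, not an example from the chapter):

```python
import numpy as np

p_true = 0.8
g = np.array([p_true ** (k - 1) for k in range(1, 10)])  # g_k = p^(k-1), k >= 1
print(g[1:] / g[:-1])  # every ratio g_{k+1}/g_k equals 0.8, independent of k
```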

Less clear is the case when an impulse response is generated by a system with higher-order $a(z)$ and $b(z)$. In fact, there is no guarantee that an arbitrary impulse response is the result of a linear system of difference equations at all. For an LTI filter, however, the coefficients of the impulse response exhibit a linear dependence which may be used not only to verify the linearity of the system, but also to construct a transfer-function representation. The exact nature of this linear dependence may be found by forming a structured matrix of impulse-response coefficients and examining its behavior when the indices of the coefficients are shifted forward by a single increment, similar to the single-pole case in (). The result is stated in the following theorem, originally due to Kronecker [4] and adapted from the English translation of [12].

Theorem 1 (Kronecker's Theorem) Suppose $G(z)$ is an infinite series of descending powers of $z$, starting with $z^{-1}$,

$$G(z) = g_1z^{-1} + g_2z^{-2} + g_3z^{-3} + \cdots.$$
Assume $G(z)$ is analytic (the series converges) for all $|z| > 1$. Let $H$ be an infinitely large matrix of the form

$$H = \begin{bmatrix} g_1 & g_2 & g_3 & \cdots \\ g_2 & g_3 & g_4 & \cdots \\ g_3 & g_4 & g_5 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}.$$
Then $H$ has finite rank $n$ if and only if $G(z)$ is a strictly proper, coprime, rational function of degree $n$ with poles inside the unit circle. That is, $G(z)$ has an alternative representation

$$G(z) = \frac{b(z)}{a(z)},$$

in which $m < n$, all roots of $a(z)$ satisfy $|z| < 1$, $a(z)$ and $b(z)$ have no common roots, and we have assumed without loss of generality that $a(z)$ is monic.

To prove Theorem 1, we first prove that, for $k > n$, $g_k$ must be linearly dependent on the previous $n$ terms of the series for $H$ to have finite rank.

Theorem 2 The infinitely large matrix $H$ is of finite rank $n$ if and only if there exists a finite sequence $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that for $k \ge n$,

$$g_{k+1} = \sum_{j=1}^{n} \alpha_jg_{k-j+1},$$
and n is the smallest number with this property.

Let $h_k$ be the row of $H$ beginning with $g_k$. If $H$ has rank $n$, then the first $n+1$ rows of $H$ are linearly dependent. This implies that for some $1 \le p \le n$, $h_{p+1}$ is a linear combination of $h_1, \ldots, h_p$, and thus there exists some sequence $\alpha_k$ such that

$$h_{p+1} = \sum_{j=1}^{p} \alpha_jh_{p-j+1}.$$
The structure and infinite size of $H$ imply that such a relationship must hold for all following rows of $H$, so that for $q \ge 0$,

$$h_{p+q+1} = \sum_{j=1}^{p} \alpha_jh_{p+q-j+1}.$$
Hence any row $h_k$, $k > p$, can be expressed as a linear combination of the previous $p$ rows. Since $H$ has at least $n$ linearly independent rows, $p = n$, and since this applies element-wise, $\operatorname{rank}(H) = n$ implies ().

Alternatively, () implies a relationship of the form () exists, and hence $\operatorname{rank}(H) = p$. Since $n$ is the smallest possible $p$, this implies $\operatorname{rank}(H) = n$.

We now prove Theorem 1. Suppose $G(z)$ is a coprime rational function of the form () with series expansion (), which we know exists, since $G(z)$ is analytic for $|z| > 1$. Without loss of generality, let $m = n - 1$, since we may always let $b_k = 0$ for some $k$. Hence

$$\frac{b_1z^{n-1} + b_2z^{n-2} + \cdots + b_n}{z^n + a_1z^{n-1} + \cdots + a_n} = g_1z^{-1} + g_2z^{-2} + g_3z^{-3} + \cdots.$$
Multiplying both sides by the denominator of the left,

$$b_1z^{n-1} + b_2z^{n-2} + \cdots + b_n = \left(z^n + a_1z^{n-1} + \cdots + a_n\right)\left(g_1z^{-1} + g_2z^{-2} + \cdots\right),$$
and equating powers of $z$, we find

$$\begin{aligned} b_1 &= g_1, \\ b_2 &= g_2 + a_1g_1, \\ &\;\,\vdots \\ b_n &= g_n + a_1g_{n-1} + \cdots + a_{n-1}g_1. \end{aligned}$$
From this, we have, for $k \ge n$,

$$g_{k+1} = -\sum_{j=1}^{n} a_jg_{k-j+1},$$
which not only shows that () holds, but also shows that $\alpha_j = -a_j$. Hence by Theorem 2, $H$ must have finite rank.

Conversely, suppose $H$ has finite rank. Then () holds, and we may construct $a(z)$ from $\alpha_k$ and $b(z)$ from () to create a rational function. This function must be coprime, since its order $n$ is the smallest possible.

The construction in Theorem 1 is simple to extend to the case in which $G(z)$ is only proper and not strictly proper; the additional coefficient is simply the feed-through term in the impulse response, that is, $g_0$.

A result of Theorem 1 is that, given the finite-dimensional, full-rank matrices

$$H_k = \begin{bmatrix} g_k & g_{k+1} & \cdots & g_{k+n-1} \\ g_{k+1} & g_{k+2} & \cdots & g_{k+n} \\ \vdots & \vdots & & \vdots \\ g_{k+n-1} & g_{k+n} & \cdots & g_{k+2n-2} \end{bmatrix},$$

$$H_{k+1} = \begin{bmatrix} g_{k+1} & g_{k+2} & \cdots & g_{k+n} \\ g_{k+2} & g_{k+3} & \cdots & g_{k+n+1} \\ \vdots & \vdots & & \vdots \\ g_{k+n} & g_{k+n+1} & \cdots & g_{k+2n-1} \end{bmatrix},$$

the coefficients of $a(z)$ may be calculated as

$$\begin{bmatrix} -a_n \\ -a_{n-1} \\ \vdots \\ -a_1 \end{bmatrix} = H_k^{-1}\begin{bmatrix} g_{k+n} \\ g_{k+n+1} \\ \vdots \\ g_{k+2n-1} \end{bmatrix},$$

which is the last column of $H_k^{-1}H_{k+1}$ and follows directly from the linear dependence ().
Notice that () is in fact a special case of (). Thus we need only know the first $2n+1$ impulse-response coefficients to reconstruct the transfer function $G(z)$: $2n$ to form the matrices $H_k$ and $H_{k+1}$ from () and (), respectively, and possibly the initial coefficient $g_0$ in the case of an $n$th-order $b(z)$.

Matrices with the structure of $H$ are useful enough to have a special name. A Hankel matrix $H$ is a matrix constructed from a sequence $\{h_k\}$ so that each element is $H_{(j,k)} = h_{j+k}$. For the Hankel matrix in (), $h_k = g_{k-1}$. $H_k$ also has an interesting property implied by (): its row space is invariant under shifting of the index $k$. Because it is symmetric, this is also true for its column space. Thus this matrix is also often referred to as being shift-invariant.
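The following sketch (Python/NumPy) illustrates the reconstruction: it generates the impulse response of an arbitrarily chosen second-order example — the helper name hankel and the coefficients are ours, not the chapter's — and recovers the coefficients of $a(z)$ from the first $2n$ nonzero-index coefficients.

```python
import numpy as np

def hankel(g, rows, cols, start):
    """Hankel matrix with (i, j) entry g[start + i + j] (zero-based)."""
    return np.array([[g[start + i + j] for j in range(cols)] for i in range(rows)])

# Impulse response of an example system with a(z) = z^2 - 1.3 z + 0.4, b(z) = z
a = [-1.3, 0.4]                      # [a_1, a_2]
g = np.zeros(12); g[1] = 1.0         # g_0 = 0 (strictly proper), g_1 = 1
for k in range(2, 12):
    g[k] = -a[0] * g[k - 1] - a[1] * g[k - 2]

n = 2
Hk  = hankel(g, n, n, start=1)       # built from g_1 ... g_{2n-1}
rhs = g[n + 1 : 2 * n + 1]           # g_{n+1} ... g_{2n}
coeffs = np.linalg.solve(Hk, rhs)    # solves H_k [-a_n, ..., -a_1]^T = rhs
print(coeffs)                        # [-a_2, -a_1] = [-0.4, 1.3]
```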

While () provides a potential method of identifying a system from a measured impulse response, it is not a reliable method to use with measured impulse-response coefficients that are corrupted by noise. The exact linear dependence of the coefficients will not be identical for all $k$, and the structure of () will not be preserved. Inverting $H_k$ will also invert any noise on $g_k$, potentially amplifying high-frequency noise content. Finally, the system order $n$ is required to be known beforehand, which is usually not the case if only an impulse response is available. Fortunately, these difficulties may all be overcome by reinterpreting the results of Kronecker's theorem in a state-space framework. First, however, we more carefully examine the role of the Hankel matrix in the behavior of LTI filters.

2.4. Hankel and Toeplitz operators

The Hankel matrix of impulse-response coefficients () is more than a tool for computing the transfer-function representation of a system from its impulse response. It also defines the mapping of past input signals to future output signals. To define exactly what this means, we write the convolution of () around sample $k = 0$ in matrix form as

$$\begin{bmatrix} \vdots \\ y_{-1} \\ y_0 \\ y_1 \\ \vdots \end{bmatrix} = \begin{bmatrix} \ddots & & & \\ \cdots & g_0 & & \\ \cdots & g_1 & g_0 & \\ \cdots & g_2 & g_1 & g_0 \\ & \vdots & \vdots & \vdots \end{bmatrix}\begin{bmatrix} \vdots \\ u_{-1} \\ u_0 \\ u_1 \\ \vdots \end{bmatrix},$$
where the vectors and matrix have been partitioned into sections for $k < 0$ and $k \ge 0$. The output for $k \ge 0$ may then be split into two parts:

$$y_f = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \end{bmatrix} = \underbrace{\begin{bmatrix} g_1 & g_2 & g_3 & \cdots \\ g_2 & g_3 & g_4 & \cdots \\ g_3 & g_4 & g_5 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}}_{H} \underbrace{\begin{bmatrix} u_{-1} \\ u_{-2} \\ u_{-3} \\ \vdots \end{bmatrix}}_{u_p} + \underbrace{\begin{bmatrix} g_0 & & & \\ g_1 & g_0 & & \\ g_2 & g_1 & g_0 & \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}}_{T} \underbrace{\begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \end{bmatrix}}_{u_f},$$
where the subscripts $p$ and $f$ denote "past" and "future," respectively. The system Hankel matrix $H$ has returned to describe the effects of the past input $u_p$ on the future output $y_f$. Also present is the matrix $T$, which represents the convolution of the future input $u_f$ with the impulse response. Matrices such as $T$ with constant diagonals are called Toeplitz matrices.

From (), it can be seen that $H$ defines the effects of past input on future output. One interpretation of this is that $H$ represents the "memory" of the system. Because $H$ is a linear mapping from $u_p$ to $y_f$, the induced matrix 2-norm of $H$, $\|H\|_2$, can be considered a function norm, and in a sense, $\|H\|_2$ is a measure of the "gain" of the system. $\|H\|_2$ is often called the Hankel norm of a system, and it plays an important role in model reduction and in the analysis of anti-causal systems. More information on this aspect of linear systems can be found in the literature of robust control, for instance, [13].
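Numerically, the Hankel norm can be approximated by taking the largest singular value of a large but finite truncation of $H$ — a sketch under the assumption that the impulse response has decayed within the truncation window (the example pole is ours):

```python
import numpy as np

# Impulse response of G(z) = 1/(z - 0.5): g_k = 0.5^(k-1) for k >= 1
g = np.array([0.5 ** k for k in range(40)])        # g[i] holds g_{i+1}
r = 20
H = np.array([[g[i + j] for j in range(r)] for i in range(r)])
print(np.linalg.norm(H, 2))  # largest singular value approximates ||H||_2
```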

3. State-space representations

Although transfer functions define system behavior completely with a finite number of parameters and simplify frequency-response calculations, they are cumbersome to manipulate when the input or output is multi-dimensional or when initial conditions must be considered. The other common representation of LTI filters is the state-space form

$$x_{k+1} = Ax_k + Bu_k,$$
$$y_k = Cx_k + Du_k,$$
in which $x_k \in \mathbb{R}^n$ is the system state. The matrices $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times n_u}$, $C \in \mathbb{R}^{n_y \times n}$, and $D \in \mathbb{R}^{n_y \times n_u}$ completely parameterize the system. Only $D$ is uniquely determined by the input-output behavior; any nonsingular matrix $T'$ may be used to change the state basis via the relationships

$$\tilde{x}_k = T'^{-1}x_k, \qquad \tilde{A} = T'^{-1}AT', \qquad \tilde{B} = T'^{-1}B, \qquad \tilde{C} = CT'.$$
The Z-transform may also be applied to the state-space equations () to find

$$\mathcal{Z}[x_{k+1}] = A\,\mathcal{Z}[x_k] + B\,\mathcal{Z}[u_k] \quad\Longrightarrow\quad zX(z) = AX(z) + BU(z),$$

$$\mathcal{Z}[y_k] = C\,\mathcal{Z}[x_k] + D\,\mathcal{Z}[u_k] \quad\Longrightarrow\quad Y(z) = CX(z) + DU(z),$$

which combine to give the transfer function $G(z) = C(zI - A)^{-1}B + D$; and thus, if () is the state-space representation of the single-variable system (), then $a(z)$ is the characteristic polynomial of $A$, $\det(zI - A)$.

Besides clarifying the effect of initial conditions on the output, state-space representations are inherently causal, and () will always result in a proper system (strictly proper if D=0). For this reason, state-space representations are often called realizable descriptions; while the forward-time-shift of z is an inherently non-causal operation, state-space systems may always be constructed in reality.

3.1. Stability, controllability, and observability of state-space representations

The system impulse response is simple to formulate in terms of the state-space parameters by calculation of the output to a unit impulse with $x_0 = 0$:

$$g_k = \begin{cases} D, & k = 0, \\ CA^{k-1}B, & k \ge 1. \end{cases}$$
Notice the similarity of () and (). In fact, from the eigenvalue decomposition of $A$,

$$A = V\Lambda V^{-1},$$

we find

$$g_k = CA^{k-1}B = CV\Lambda^{k-1}V^{-1}B, \qquad k \ge 1.$$
The term $\Lambda^{k-1}$ will only converge if the largest eigenvalue of $A$ is within the unit circle, and thus the condition that all eigenvalues $\lambda_i$ of $A$ satisfy $|\lambda_i| < 1$ is a necessary and sufficient condition for stability.
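The impulse-response formula and the eigenvalue stability test are easy to check numerically; a sketch with an arbitrarily chosen second-order system (our example, not the chapter's):

```python
import numpy as np

# Companion-form realization of G(z) = 1/(z^2 - 1.3 z + 0.4)
A = np.array([[1.3, -0.4], [1.0, 0.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])
D = np.array([[0.0]])

# Impulse response: g_0 = D, g_k = C A^(k-1) B for k >= 1
g, Ak = [D], np.eye(2)
for k in range(1, 8):
    g.append(C @ Ak @ B)
    Ak = Ak @ A
print(np.hstack(g).ravel())

# Stability: all eigenvalues of A strictly inside the unit circle
print(np.abs(np.linalg.eigvals(A)))  # {0.8, 0.5}: stable
```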

For state-space representations, there is the possibility that a combination of $A$ and $B$ will result in a system for which $x_k$ cannot be entirely controlled by the input $u_k$. Expressing $x_k$ in a matrix form similar to () as

$$x_k = \mathcal{C}\,u_p = \begin{bmatrix} B & AB & A^2B & \cdots \end{bmatrix} \begin{bmatrix} u_{k-1} \\ u_{k-2} \\ u_{k-3} \\ \vdots \end{bmatrix}$$
demonstrates that $x_k$ can attain any value in $\mathbb{R}^n$ if and only if $\mathcal{C}$ has rank $n$. $\mathcal{C}$ is the controllability matrix, and the system is controllable if $\mathcal{C}$ has full row rank.

Similarly, the state $x_k$ may not uniquely determine the output for some combinations of $A$ and $C$. Expressing the evolution of the output as a function of the state in matrix form (in the absence of input) as

$$\begin{bmatrix} y_k \\ y_{k+1} \\ y_{k+2} \\ \vdots \end{bmatrix} = \mathcal{O}\,x_k = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \end{bmatrix} x_k$$
demonstrates that there is no nontrivial null space in the mapping from $x_k$ to $y_k$ if and only if $\mathcal{O}$ has rank $n$. $\mathcal{O}$ is the observability matrix, and the system is observable if $\mathcal{O}$ has full column rank.

Systems that are both controllable and observable are called minimal, and for minimal systems, the dimension n of the state variable cannot be reduced. In the next section we show that minimal state-space system representations convert to coprime transfer functions that are found through ().
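By the Cayley-Hamilton theorem, block columns of $\mathcal{C}$ (and block rows of $\mathcal{O}$) beyond the $n$th add no rank, so finite truncations suffice for these rank tests; a minimal sketch (the helper names ctrb and obsv are ours):

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^(n-1) B]."""
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(A.shape[0])])

def obsv(A, C):
    """Observability matrix [C; CA; ...; C A^(n-1)]."""
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(A.shape[0])])

A = np.array([[1.3, -0.4], [1.0, 0.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])

n = A.shape[0]
print(np.linalg.matrix_rank(ctrb(A, B)) == n,   # controllable
      np.linalg.matrix_rank(obsv(A, C)) == n)   # observable -> minimal
```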

3.2. Construction of state-space representations from impulse responses

The fact that the denominator of $G(z)$ is the characteristic polynomial of $A$ not only allows for the calculation of a transfer function from a state-space representation, but provides an alternative version of Kronecker's theorem for state-space systems, known as the Ho-Kalman Algorithm [3]. From the Cayley-Hamilton theorem, if $a(z)$ is the characteristic polynomial of $A$, then $a(A) = 0$, and

$$A^n = -a_1A^{n-1} - a_2A^{n-2} - \cdots - a_nI,$$

which implies

$$g_{k+1} = CA^kB = -\sum_{j=1}^{n} a_jCA^{k-j}B = -\sum_{j=1}^{n} a_jg_{k-j+1}, \qquad k \ge n.$$
Indeed, substitution of () into () and rearrangement of the indices leads to (). Additionally, substitution of () into the product of $\mathcal{O}$ and $\mathcal{C}$ shows that

$$\mathcal{O}\,\mathcal{C} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \end{bmatrix} \begin{bmatrix} B & AB & A^2B & \cdots \end{bmatrix} = \begin{bmatrix} CB & CAB & CA^2B & \cdots \\ CAB & CA^2B & CA^3B & \cdots \\ CA^2B & CA^3B & CA^4B & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} = H,$$
which confirms our previous statement that $H$ effectively represents the memory of the system. Because

$$\operatorname{rank}(H) = \operatorname{rank}(\mathcal{O}\,\mathcal{C}) \le \min\{\operatorname{rank}(\mathcal{O}),\ \operatorname{rank}(\mathcal{C})\} \le n,$$

we see that $\operatorname{rank}(H) = n$ implies the state-space system () is minimal.

If the entries of $H$ are shifted forward by one index to form

$$\bar{H} = \begin{bmatrix} g_2 & g_3 & g_4 & \cdots \\ g_3 & g_4 & g_5 & \cdots \\ g_4 & g_5 & g_6 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix},$$

then once again substituting () reveals

$$\bar{H} = \mathcal{O}A\,\mathcal{C}.$$
Thus the row space and column space of H are invariant under a forward-shift of the indices, implying the same shift-invariant structure seen in ().

The appearance of $A$ in () hints at a method for constructing a state-space realization from an impulse response. Suppose the impulse response is known exactly, and let $H_r$ be a finite slice of $H$ with $r$ block rows and $L$ block columns,

$$H_r = \begin{bmatrix} g_1 & g_2 & \cdots & g_L \\ g_2 & g_3 & \cdots & g_{L+1} \\ \vdots & \vdots & & \vdots \\ g_r & g_{r+1} & \cdots & g_{L+r-1} \end{bmatrix}.$$
Then any appropriately dimensioned factorization

$$H_r = \mathcal{O}_r\,\mathcal{C}_L$$
may be used to find $A$ for some arbitrary state basis as

$$A = \mathcal{O}_r^{\dagger}\,\bar{H}_r\,\mathcal{C}_L^{\dagger},$$
where $\bar{H}_r$ is $H_r$ with the indices shifted forward once and $(\cdot)^{\dagger}$ is the Moore-Penrose pseudoinverse. $C$ taken from the first block row of $\mathcal{O}_r$, $B$ taken from the first block column of $\mathcal{C}_L$, and $D$ taken from $g_0$ then provide a complete and minimal state-space realization from an impulse response. Because $H_r$ has rank $n$ and $\det(zI - A)$ has degree $n$, we know from Kronecker's theorem that $G(z)$ taken from () will be coprime.
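A sketch of this construction for an exact impulse response, using an SVD to supply one valid factorization $H_r = \mathcal{O}_r\mathcal{C}_L$ (the helper name ho_kalman and the running second-order example are ours):

```python
import numpy as np

def ho_kalman(g, n, r, L):
    """Realize (A, B, C, D) from an exact impulse response g = [g_0, g_1, ...]."""
    H    = np.array([[g[1 + i + j] for j in range(L)] for i in range(r)])
    Hbar = np.array([[g[2 + i + j] for j in range(L)] for i in range(r)])
    U, s, Vt = np.linalg.svd(H)
    Or = U[:, :n] * np.sqrt(s[:n])            # one factorization H_r = Or @ Cl
    Cl = np.sqrt(s[:n])[:, None] * Vt[:n]
    A = np.linalg.pinv(Or) @ Hbar @ np.linalg.pinv(Cl)
    return A, Cl[:, :1], Or[:1, :], np.array([[g[0]]])   # A, B, C, D

# Exact impulse response of the running example (poles 0.8 and 0.5)
g = np.zeros(30); g[2] = 1.0
for k in range(3, 30):
    g[k] = 1.3 * g[k - 1] - 0.4 * g[k - 2]

A, B, C, D = ho_kalman(g, n=2, r=5, L=10)
print(np.sort(np.abs(np.linalg.eigvals(A))))  # recovers the poles {0.5, 0.8}
```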

However, as mentioned before, the impulse response of the system is rarely known exactly. In this case only an estimate $\hat{H}_r$ with a non-deterministic error term $E$ is available:

$$\hat{H}_r = H_r + E.$$

Because $E$ is non-deterministic, $\hat{H}_r$ will always have full rank, regardless of the number of block rows $r$. Thus $n$ cannot be determined from examining the rank of $\hat{H}_r$, and even if $n$ is known beforehand, a factorization () for $r > n$ will not exist. Thus we must find a way of reducing the rank of $\hat{H}_r$ in order to find a state-space realization.

3.3. Rank-reduction of the Hankel matrix estimate

If $\hat{H}_r$ has full rank, or if $n$ is unknown, its rank must be reduced prior to factorization. The obvious tool for reducing the rank of matrices is the singular-value decomposition (SVD). Assume for now that $n$ is known. The SVD of $\hat{H}_r$ is

$$\hat{H}_r = U\Sigma V^T,$$

where $U$ and $V^T$ are orthogonal matrices and $\Sigma$ is a diagonal matrix containing the nonnegative singular values $\sigma_i$, ordered from largest to smallest. The SVD is guaranteed to exist for any matrix, its singular values are unique, and the number of nonzero singular values of a matrix is equal to its rank [14].

Because $U$ and $V^T$ are orthogonal, the SVD satisfies

$$\left\|\hat{H}_r\right\|_2 = \sigma_1,$$
where $\|\cdot\|_2$ is the induced matrix 2-norm, and

$$\left\|\hat{H}_r\right\|_F = \left(\sum_i \sigma_i^2\right)^{1/2},$$
where $\|\cdot\|_F$ is the Frobenius norm. Equation () also shows that the Hankel norm of a system is the maximum singular value of $H_r$. From () and (), we can directly see that if the SVD of $\hat{H}_r$ is partitioned into

$$\hat{H}_r = \begin{bmatrix} U_n & U_s \end{bmatrix} \begin{bmatrix} \Sigma_n & 0 \\ 0 & \Sigma_s \end{bmatrix} \begin{bmatrix} V_n^T \\ V_s^T \end{bmatrix},$$
where $U_n$ is the first $n$ columns of $U$, $\Sigma_n$ is the upper-left $n \times n$ block of $\Sigma$, and $V_n^T$ is the first $n$ rows of $V^T$, the solution to the rank-reduction problem is [14]

$$\min_{\operatorname{rank}(\tilde{H}) = n} \left\|\hat{H}_r - \tilde{H}\right\| \quad\Longrightarrow\quad \tilde{H} = U_n\Sigma_nV_n^T,$$

optimal in both the induced 2-norm and the Frobenius norm.
Additionally, the error resulting from the rank reduction is

$$\left\|\hat{H}_r - U_n\Sigma_nV_n^T\right\|_2 = \sigma_{n+1},$$

which suggests that if the rank of $H_r$ is not known beforehand, it can be determined by examining the nonzero singular values in the deterministic case or by searching for a significant drop-off in singular values if only a noise-corrupted estimate is available.
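A sketch of the order-selection and truncation steps on a noise-corrupted Hankel matrix (the noise level and the example system are our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
g = np.zeros(30); g[2] = 1.0                 # the running 2nd-order example
for k in range(3, 30):
    g[k] = 1.3 * g[k - 1] - 0.4 * g[k - 2]
g_hat = g + 0.01 * rng.standard_normal(30)   # noisy impulse-response estimate

H_hat = np.array([[g_hat[1 + i + j] for j in range(10)] for i in range(5)])
U, s, Vt = np.linalg.svd(H_hat)
print(s)   # two dominant singular values, then a noise floor: suggests n = 2

H_n = U[:, :2] @ np.diag(s[:2]) @ Vt[:2]     # nearest rank-2 matrix
print(np.linalg.norm(H_hat - H_n, 2), s[2])  # truncation error = sigma_3
```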

3.4. Identifying the state-space realization

From a rank-reduced $\hat{H}_r$, any factorization

$$U_n\Sigma_nV_n^T = \hat{\mathcal{O}}_r\,\hat{\mathcal{C}}_L$$

can be used to estimate $\mathcal{O}_r$ and $\mathcal{C}_L$. The error in the state-space realization, however, will depend on the chosen state basis. Generally we would like to have a state variable with a norm $\|x_k\|_2$ in between $\|u_k\|_2$ and $\|y_k\|_2$. As first proposed in [5], choosing the factorization

$$\hat{\mathcal{O}}_r = U_n\Sigma_n^{1/2}, \qquad \hat{\mathcal{C}}_L = \Sigma_n^{1/2}V_n^T$$
results in

$$\hat{\mathcal{O}}_r^T\hat{\mathcal{O}}_r = \hat{\mathcal{C}}_L\hat{\mathcal{C}}_L^T = \Sigma_n,$$
and thus, from a functional perspective, the mappings from input to state and state to output will have equal magnitudes, and the entries of the state vector $x_k$ will have similar magnitudes. State-space realizations that satisfy () are sometimes called internally balanced realizations [11]. (Alternative definitions of a "balanced" realization exist, however, and it is generally wise to verify the definition in each context.)

Choosing the factorization () also simplifies computation of the estimate $\hat{A}$, since

$$\hat{A} = \hat{\mathcal{O}}_r^{\dagger}\,\hat{\bar{H}}_r\,\hat{\mathcal{C}}_L^{\dagger} = \Sigma_n^{-1/2}U_n^T\,\hat{\bar{H}}_r\,V_n\Sigma_n^{-1/2}.$$
By estimating $\hat{B}$ as the first block column of $\hat{\mathcal{C}}_L$, $\hat{C}$ as the first block row of $\hat{\mathcal{O}}_r$, and $\hat{D}$ as $g_0$, a complete state-space realization $(\hat{A}, \hat{B}, \hat{C}, \hat{D})$ is identified by this method.
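Putting the pieces together gives a sketch of Kung's method [6] applied to a noisy impulse-response estimate (the helper name kung and the noise model are our choices):

```python
import numpy as np

def kung(g_hat, n, r, L):
    """Balanced realization from a (noisy) impulse-response estimate."""
    H    = np.array([[g_hat[1 + i + j] for j in range(L)] for i in range(r)])
    Hbar = np.array([[g_hat[2 + i + j] for j in range(L)] for i in range(r)])
    U, s, Vt = np.linalg.svd(H)
    Un, sn, Vtn = U[:, :n], s[:n], Vt[:n]
    # A = Sigma_n^(-1/2) Un' Hbar Vn Sigma_n^(-1/2)
    A = (Un / np.sqrt(sn)).T @ Hbar @ (Vtn.T / np.sqrt(sn))
    B = (np.sqrt(sn)[:, None] * Vtn)[:, :1]   # first block column of C_L
    C = (Un * np.sqrt(sn))[:1, :]             # first block row of O_r
    return A, B, C, np.array([[g_hat[0]]])

rng = np.random.default_rng(0)
g = np.zeros(30); g[2] = 1.0
for k in range(3, 30):
    g[k] = 1.3 * g[k - 1] - 0.4 * g[k - 2]

A, B, C, D = kung(g + 0.01 * rng.standard_normal(30), n=2, r=5, L=10)
print(np.sort(np.abs(np.linalg.eigvals(A))))  # close to the true poles {0.5, 0.8}
```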

3.5. Pitfalls of direct realization from an impulse response

Even though the rank-reduction process allows for realization from a noise-corrupted estimate of an impulse response, identification methods that generate a system estimate from a Hankel matrix constructed from an estimated impulse response have numerous difficulties when applied to noisy measurements. Measuring an impulse response directly is often infeasible; high-frequency damping may result in a measurement that has a very brief response before the signal-to-noise ratio becomes prohibitively small, and a unit pulse will often excite high-frequency nonlinearities that degrade the quality of the resulting estimate.

Taking the inverse Fourier transform of the frequency response guarantees that the estimates of the Markov parameters will converge as the dataset grows only so long as the input is broadband. Generally input signals decay in magnitude at higher frequencies, and calculation of the frequency response by inversion of the input will amplify high-frequency noise. We would prefer an identification method that is guaranteed to provide a system estimate that converges to the true system as the amount of data measured increases and that avoids inverting the input. Fortunately, the relationship between input and output data in () may be used to formulate just such an identification procedure.

4. Realization from input-output data

To avoid the difficulties in constructing a system realization from an estimated impulse response, we will form a realization-based identification procedure applicable to measured input-output data. To sufficiently account for non-deterministic effects in measured data, we add a noise term $v_k \in \mathbb{R}^{n_y}$ to the output to form the noise-perturbed state-space equations

$$x_{k+1} = Ax_k + Bu_k,$$
$$y_k = Cx_k + Du_k + v_k.$$
We assume that the noise signal vk is generated by a stationary stochastic process, which may be either white or colored. This includes the case in which the state is disturbed by process noise, so that the noise process may have the same poles as the deterministic system. (See [15] for a thorough discussion of representations of noise in the identification context.)

4.1. Data-matrix equations

The goal is to construct a state-space realization using the relationships in (), but doing so requires a complete characterization of the row space of $H_r$. To this end, we expand a finite slice of the future output vector to form a block-Hankel matrix of output data with $r$ block rows,

$$Y = \begin{bmatrix} y_0 & y_1 & y_2 & \cdots & y_{L-1} \\ y_1 & y_2 & y_3 & \cdots & y_L \\ \vdots & \vdots & \vdots & & \vdots \\ y_{r-1} & y_r & y_{r+1} & \cdots & y_{L+r-2} \end{bmatrix}.$$
This matrix is related to a block-Hankel matrix of future input data

$$U_f = \begin{bmatrix} u_0 & u_1 & u_2 & \cdots & u_{L-1} \\ u_1 & u_2 & u_3 & \cdots & u_L \\ \vdots & \vdots & \vdots & & \vdots \\ u_{r-1} & u_r & u_{r+1} & \cdots & u_{L+r-2} \end{bmatrix},$$
a block-Toeplitz matrix of past input data

$$U_p = \begin{bmatrix} u_{-1} & u_0 & u_1 & \cdots & u_{L-2} \\ u_{-2} & u_{-1} & u_0 & \cdots & u_{L-3} \\ u_{-3} & u_{-2} & u_{-1} & \cdots & u_{L-4} \\ \vdots & \vdots & \vdots & & \vdots \end{bmatrix},$$
a finite-dimensional block-Toeplitz matrix

$$T = \begin{bmatrix} g_0 & & & \\ g_1 & g_0 & & \\ \vdots & \vdots & \ddots & \\ g_{r-1} & g_{r-2} & \cdots & g_0 \end{bmatrix},$$
the system Hankel matrix $H$, and a block-Hankel matrix $V$ formed from noise data $v_k$ with the same indices as $Y$, by the equation

$$Y = HU_p + TU_f + V.$$
If the entries of $Y$ are shifted forward by one index to form

$$\bar{Y} = \begin{bmatrix} y_1 & y_2 & y_3 & \cdots & y_L \\ y_2 & y_3 & y_4 & \cdots & y_{L+1} \\ \vdots & \vdots & \vdots & & \vdots \\ y_r & y_{r+1} & y_{r+2} & \cdots & y_{L+r-1} \end{bmatrix},$$
then $\bar{Y}$ is related to the shifted system Hankel matrix $\bar{H}$; the past input data $U_p$; $T$ with a block column appended to the left,

$$\bar{T} = \begin{bmatrix} g_1 & g_0 & & & \\ g_2 & g_1 & g_0 & & \\ \vdots & \vdots & \vdots & \ddots & \\ g_r & g_{r-1} & g_{r-2} & \cdots & g_0 \end{bmatrix};$$

$U_f$ with a block row appended to the bottom,

$$\bar{U}_f = \begin{bmatrix} u_0 & u_1 & u_2 & \cdots & u_{L-1} \\ \vdots & \vdots & \vdots & & \vdots \\ u_r & u_{r+1} & u_{r+2} & \cdots & u_{L+r-1} \end{bmatrix};$$
and a block-Hankel matrix $\bar{V}$ of noise data $v_k$ with the same indices as $\bar{Y}$, by the equation

$$\bar{Y} = \bar{H}U_p + \bar{T}\bar{U}_f + \bar{V}.$$
From (), the state is equal to the column vectors of $U_p$ multiplied by the entries of the controllability matrix $\mathcal{C}$, which we may represent as the block matrix

$$X = \begin{bmatrix} x_0 & x_1 & x_2 & \cdots & x_{L-1} \end{bmatrix} = \mathcal{C}\,U_p,$$
which is an alternative means of representing the memory of the system at samples $0, 1, \ldots$. The two data-matrix equations () and () may then be written as

$$Y = \mathcal{O}_rX + TU_f + V,$$

$$\bar{Y} = \mathcal{O}_rAX + \bar{T}\bar{U}_f + \bar{V}.$$

Equation () is the basis for the field of system identification methods known as subspace methods. Subspace identification methods typically fall into one of two categories. First, because a shifted observability matrix

$$\bar{\mathcal{O}}_r = \begin{bmatrix} CA \\ CA^2 \\ \vdots \\ CA^r \end{bmatrix} = \mathcal{O}_rA \quad\text{satisfies}\quad \operatorname{im}(\bar{\mathcal{O}}_r) \subseteq \operatorname{im}(\mathcal{O}_r),$$

where $\operatorname{im}(\cdot)$ denotes the row space (often called the "image"), the row space of $\mathcal{O}$ is shift-invariant, and $A$ may be identified from estimates $\hat{\mathcal{O}}_r$ and $\hat{\bar{\mathcal{O}}}_r$ as

$$\hat{A} = \hat{\mathcal{O}}_r^{\dagger}\,\hat{\bar{\mathcal{O}}}_r.$$
Alternatively, because a forward-propagated sequence of states

$$\bar{X} = \begin{bmatrix} x_1 & x_2 & \cdots & x_L \end{bmatrix} = AX$$

(once the effects of the input have been removed), the column space of $X$ is shift-invariant, and $A$ may be identified from estimates $\hat{X}$ and $\hat{\bar{X}}$ as

$$\hat{A} = \hat{\bar{X}}\,\hat{X}^{\dagger}.$$
In both instances, the system dynamics are estimated by propagating the indices forward by one step and examining a propagation of linear dynamics, not unlike () from Kronecker's theorem. Details of these methods may be found in [16] and [17]. In the next section we present a system identification method that constructs system estimates from the shift-invariant structure of Y itself.

4.2. Identification from shift-invariance of output measurements

Equations () and () still contain the effects of the future input in $\bar{U}_f$. To remove these effects from the output, we must first add some assumptions about $\bar{U}_f$. First, we assume that $\bar{U}_f$ has full row rank. This is true for any $U_f$ with a smooth frequency response or if $\bar{U}_f$ is generated from some pseudo-random sequence. Next, we assume that the initial conditions in $X$ do not somehow cancel out the effects of future input. A sufficient condition for this is to require

$$\begin{bmatrix} X \\ \bar{U}_f \end{bmatrix}$$

to have full row rank. Although this assumption might appear restrictive at first, and it is impossible to verify without knowledge of $X$, it holds in all but some pathological cases.

Next, we form the null-space projector matrix

$$\Pi = I - \bar{U}_f^T\left(\bar{U}_f\bar{U}_f^T\right)^{-1}\bar{U}_f,$$
which has the property

$$\bar{U}_f\,\Pi = 0.$$
We know the inverse of $\bar{U}_f\bar{U}_f^T$ exists, since we assume $\bar{U}_f$ has full row rank. Projector matrices such as () have many interesting properties. Their eigenvalues are all 0 or 1, and if they are symmetric, they separate the space of real vectors — in this case, vectors in $\mathbb{R}^L$ — into a subspace and its orthogonal complement. In fact, it is simple to verify that the null space of $\bar{U}_f$ is contained in the null space of $U_f$, since $U_f$ forms the first $r$ block rows of $\bar{U}_f$, and hence

$$U_f\,\Pi = 0.$$
Thus multiplication of () and () on the right by $\Pi$ results in

$$Y\Pi = \mathcal{O}_rX\Pi + V\Pi,$$

$$\bar{Y}\Pi = \mathcal{O}_rAX\Pi + \bar{V}\Pi.$$

It is also unnecessary to compute the projected products $Y\Pi$ and $\bar{Y}\Pi$ directly, since from the QR-decomposition

$$\begin{bmatrix} \bar{U}_f^T & Y^T \end{bmatrix} = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{bmatrix},$$
we have

$$Y = R_{12}^TQ_1^T + R_{22}^TQ_2^T$$
and $\bar{U}_f = R_{11}^TQ_1^T$. Substitution into () reveals

$$\Pi = I - Q_1Q_1^T.$$
Because the columns of $Q_1$ and $Q_2$ are orthogonal, multiplication of () on the right by () results in

$$Y\Pi = R_{22}^TQ_2^T.$$
A similar result holds for $\bar{Y}\Pi$. Taking the QR-decomposition of the data can alternatively be thought of as using the principle of superposition to construct new sequences of input-output data through a Gram-Schmidt-type orthogonalization process. A detailed discussion of this interpretation can be found in [18].
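A numerical check that the QR route and the direct projection agree (random stand-in data; all names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
L = 200
Ufb = rng.standard_normal((3, L))  # stand-in for U-bar_f
Y   = rng.standard_normal((4, L))  # stand-in for Y

# Direct projection: Pi = I - Ufb' (Ufb Ufb')^-1 Ufb
Pi = np.eye(L) - Ufb.T @ np.linalg.solve(Ufb @ Ufb.T, Ufb)

# Equivalent computation from one QR decomposition of [Ufb' Y']
Q, R = np.linalg.qr(np.vstack([Ufb, Y]).T, mode='reduced')
Q2, R22 = Q[:, 3:], R[3:, 3:]
print(np.allclose(Y @ Pi, R22.T @ Q2.T))  # True
```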

Thus we have successfully removed the effects of future input on the output while retaining the effects of the past, which is the foundation of the realization process. We still must account for non-deterministic effects in $V$ and $\bar{V}$. To do so, we look for some matrix $Z$ such that

$$VZ^T \to 0,$$

$$\bar{V}Z^T \to 0.$$

This requires the content of $Z$ to be statistically independent of the process that generates $v_k$. The input $u_k$ is just such a signal, so long as the input is not a function of the filter output — that is, the data was measured in open-loop operation. If we begin measuring input before $k = 0$ at some sample $k = -\zeta$ and construct $Z$ as a block-Hankel matrix of past input data,

$$Z = \frac{1}{L}\begin{bmatrix} u_{-\zeta} & u_{-\zeta+1} & \cdots & u_{-\zeta+L-1} \\ u_{-\zeta+1} & u_{-\zeta+2} & \cdots & u_{-\zeta+L} \\ \vdots & \vdots & & \vdots \\ u_{-1} & u_0 & \cdots & u_{L-2} \end{bmatrix},$$

then multiplication of () and () on the right by $Z^T$ results in

$$Y\Pi Z^T \to \mathcal{O}_rX\Pi Z^T,$$

$$\bar{Y}\Pi Z^T \to \mathcal{O}_rAX\Pi Z^T$$

as $L \to \infty$. Note the factor $\frac{1}{L}$ in $Z$ is necessary to keep () and () bounded.

Finally we are able to perform our rank-reduction technique directly on measured data without needing to first estimate the impulse response. From the SVD

$$Y\Pi Z^T = U\Sigma V^T,$$
we may estimate the order $n$ by looking for a sudden decrease in singular values. From the partitioning

$$Y\Pi Z^T = \begin{bmatrix} U_n & U_s \end{bmatrix} \begin{bmatrix} \Sigma_n & 0 \\ 0 & \Sigma_s \end{bmatrix} \begin{bmatrix} V_n^T \\ V_s^T \end{bmatrix},$$
we may estimate $\mathcal{O}_r$ and $X\Pi Z^T$ from the factorization

$$\hat{\mathcal{O}}_r = U_n\Sigma_n^{1/2}, \qquad \widehat{X\Pi Z^T} = \Sigma_n^{1/2}V_n^T.$$
$A$ may then be estimated as

$$\hat{A} = \hat{\mathcal{O}}_r^{\dagger}\left(\bar{Y}\Pi Z^T\right)\left(\widehat{X\Pi Z^T}\right)^{\dagger} = \Sigma_n^{-1/2}U_n^T\left(\bar{Y}\Pi Z^T\right)V_n\Sigma_n^{-1/2}.$$
And so we have returned to our original relationship ().
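The entire procedure, end to end, on simulated open-loop data from the running second-order example (scalar signals; the helper block_hankel, the noise level, and all dimensions are our choices, not prescriptions from the chapter):

```python
import numpy as np

def block_hankel(w, first, rows, cols):
    """(Scalar) Hankel matrix with (i, j) entry w[first + i + j]."""
    return np.array([[w[first + i + j] for j in range(cols)] for i in range(rows)])

# Simulate open-loop input-output data
rng = np.random.default_rng(2)
A0 = np.array([[1.3, -0.4], [1.0, 0.0]])
B0 = np.array([1.0, 0.0])
C0 = np.array([0.0, 1.0])
N = 2000
u = rng.standard_normal(N)
y = np.zeros(N)
x = np.zeros(2)
for k in range(N):
    y[k] = C0 @ x + 0.01 * rng.standard_normal()  # v_k: white measurement noise
    x = A0 @ x + B0 * u[k]

# Data matrices: r block rows, zeta past samples as instruments
zeta, r, L, n = 5, 5, N - 20, 2
t0 = zeta                                   # treat sample zeta as "k = 0"
Y    = block_hankel(y, t0,     r,     L)
Ybar = block_hankel(y, t0 + 1, r,     L)
Ufb  = block_hankel(u, t0,     r + 1, L)    # U-bar_f
Z    = block_hankel(u, 0,      zeta,  L) / L

# Project out the future input, then apply the instruments
Pi   = np.eye(L) - Ufb.T @ np.linalg.solve(Ufb @ Ufb.T, Ufb)
G    = Y    @ Pi @ Z.T
Gbar = Ybar @ Pi @ Z.T

U, s, Vt = np.linalg.svd(G)
print(s)                                    # clear drop after the first n values
A = (U[:, :n] / np.sqrt(s[:n])).T @ Gbar @ (Vt[:n].T / np.sqrt(s[:n]))
C = (U[:, :n] * np.sqrt(s[:n]))[:1, :]      # first block row of the O_r estimate
print(np.sort(np.abs(np.linalg.eigvals(A))))  # approximately {0.5, 0.8}
```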

While $C$ may be estimated from the top block row of $\hat{\mathcal{O}}_r$, our projection has lost the column space of $H_r$ that we previously used to estimate $B$, and initial conditions in $X$ prevent us from estimating $D$ directly. Fortunately, if $A$ and $C$ are known, then the remaining terms $B$, $D$, and an initial condition $x_0$ are linear in the input-output data and may be estimated by solving a linear least-squares problem.

4.3. Estimation of B, D, and x0

The input-to-state terms $B$ and $D$ may be estimated by examining the convolution with the state-space form of the impulse response. Expanding () with the input and including an initial condition $x_0$ results in

$$y_k = CA^kx_0 + \sum_{j=0}^{k-1} CA^{k-j-1}Bu_j + Du_k + v_k.$$
Factoring out $B$ and $D$ on the right provides

$$y_k = CA^kx_0 + \left(\sum_{j=0}^{k-1} u_j^T \otimes CA^{k-j-1}\right)\operatorname{vec}(B) + \left(u_k^T \otimes I\right)\operatorname{vec}(D) + v_k,$$
in which $\operatorname{vec}(\cdot)$ is the operation that stacks the columns of a matrix on one another, $\otimes$ is the (coincidentally named) Kronecker product, and we have made use of the identity

$$\operatorname{vec}(AXB) = \left(B^T \otimes A\right)\operatorname{vec}(X).$$
Grouping the unknown terms together results in

$$y_k = \varphi_k^T\theta + v_k, \qquad \theta = \begin{bmatrix} x_0 \\ \operatorname{vec}(B) \\ \operatorname{vec}(D) \end{bmatrix}.$$
Thus by forming the regressor

$$\varphi_k^T = \begin{bmatrix} \hat{C}\hat{A}^k & \displaystyle\sum_{j=0}^{k-1} u_j^T \otimes \hat{C}\hat{A}^{k-j-1} & u_k^T \otimes I \end{bmatrix}$$
from the estimates $\hat{A}$ and $\hat{C}$, estimates of $B$ and $D$ may be found from the least-squares solution of the linear system of $N$ equations

$$\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{bmatrix} = \begin{bmatrix} \varphi_0^T \\ \varphi_1^T \\ \vdots \\ \varphi_{N-1}^T \end{bmatrix}\theta.$$
Note that $N$ is arbitrary and need not be related in any way to the indices of the data-matrix equations. This can be useful, since for large-dimensional systems the regressor $\varphi_k$ may become very expensive to compute.
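A sketch completing the pipeline, reusing the estimates A and C and the data u, y from the subspace sketch in Section 4.2 (scalar signals, so the Kronecker products reduce to ordinary scalar-matrix products; the helper name is ours):

```python
import numpy as np

def estimate_B_D_x0(A, C, u, y, N):
    """Least-squares estimate of theta = [x0; vec(B); vec(D)], scalar u and y."""
    n = A.shape[0]
    Apow = [np.eye(n)]
    for k in range(1, N):
        Apow.append(Apow[-1] @ A)
    Phi = np.zeros((N, 2 * n + 1))
    for k in range(N):
        Phi[k, :n] = (C @ Apow[k]).ravel()                     # x0 terms
        if k > 0:                                              # vec(B) terms
            Phi[k, n:2 * n] = sum(u[j] * (C @ Apow[k - j - 1]).ravel()
                                  for j in range(k))
        Phi[k, 2 * n] = u[k]                                   # vec(D) term
    theta, *_ = np.linalg.lstsq(Phi, y[:N], rcond=None)
    return theta[:n], theta[n:2 * n], theta[2 * n]

x0_hat, B_hat, D_hat = estimate_B_D_x0(A, C, u, y, N=200)
print(B_hat, D_hat)  # B in the identified state basis; D should be near 0 here
```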

5. Conclusion

Beginning with the construction of a transfer function from an impulse response, we have constructed a method for identification of state-space realizations of linear filters from measured input-output data, introducing the fundamental concepts of realization theory of linear systems along the way. Computing a state-space realization from measured input-output data requires many tools of linear algebra: projections and the QR-decomposition, rank reduction and the singular-value decomposition, and linear least squares. The principles of realization theory provide insight into the different representations of linear systems, as well as the role of rational functions and series expansions in linear algebra.


1 - Rudolf E. Kalman. On the General Theory of Control Systems. In Proceedings of the First International Congress of Automatic Control, Moscow, 1960. IRE.
2 - Rudolf E. Kalman. Mathematical Description of Linear Dynamical Systems. Journal of the Society for Industrial and Applied Mathematics, Series A: Control, 1(2):152, July 1963.
3 - B. L. Ho and Rudolf E. Kalman. Effective construction of linear state-variable models from input/output functions. Regelungstechnik, 14:545–548, 1966.
4 - Leopold Kronecker. Zur Theorie der Elimination einer Variabeln aus zwei Algebraischen Gleichungen. Monatsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, pages 535–600, 1881.
5 - Paul H. Zeiger and Julia A. McEwen. Approximate Linear Realizations of Given Dimension via Ho's Algorithm. IEEE Transactions on Automatic Control, 19(2):153, 1974.
6 - Sun-Yuan Kung. A new identification and model reduction algorithm via singular value decomposition. In Proceedings of the 12th Asilomar Conference on Circuits, Systems, and Computers, pages 705–714. IEEE, 1978.
7 - Michel Verhaegen. Identification of the deterministic part of MIMO state space models given in innovations form from input-output data. Automatica, 30(1):61–74, January 1994.
8 - Peter Van Overschee and Bart De Moor. N4SID: Subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica, 30(1):75–93, January 1994.
9 - Jer-Nan Juang and R. S. Pappa. An Eigensystem Realization Algorithm (ERA) for Modal Parameter Identification and Model Reduction. JPL Proc. of the Workshop on Identification and Control of Flexible Space Structures, 3:299–318, April 1985.
10 - Jer-Nan Juang, Minh Q. Phan, Lucas G. Horta, and Richard W. Longman. Identification of Observer/Kalman Filter Markov Parameters: Theory and Experiments. Journal of Guidance, Control, and Dynamics, 16(2):320–329, 1993.
11 - Chi-Tsong Chen. Linear System Theory and Design. Oxford University Press, New York, 1st edition, 1984.
12 - Felix R. Gantmacher. The Theory of Matrices - Volume Two. Chelsea Publishing Company, New York, 1960.
13 - Kemin Zhou, John C. Doyle, and Keith Glover. Robust and Optimal Control. Prentice Hall, August 1995.
14 - G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland, USA, third edition, 1996.
15 - Lennart Ljung. System Identification: Theory for the User. PTR Prentice Hall Information and System Sciences. Prentice Hall PTR, Upper Saddle River, NJ, 2nd edition, 1999.
16 - Michel Verhaegen and Vincent Verdult. Filtering and System Identification: A Least Squares Approach. Cambridge University Press, New York, 1st edition, May 2007.
17 - Peter Van Overschee and Bart De Moor. Subspace Identification for Linear Systems: Theory, Implementation, Applications. Kluwer Academic Publishers, London, 1996.
18 - Tohru Katayama. Role of LQ Decomposition in Subspace Identification Methods. In Alessandro Chiuso, Stefano Pinzoni, and Augusto Ferrante, editors, Modeling, Estimation and Control, pages 207–220. Springer Berlin / Heidelberg, 2007.