Open access peer-reviewed chapter

Identification of Linear, Discrete-Time Filters via Realization

By Daniel N. Miller and Raymond A. de Callafon

Submitted: November 18th 2011Reviewed: April 27th 2012Published: July 11th 2012

DOI: 10.5772/48293

Downloaded: 1866

1. Introduction

The realization of a discrete-time, linear, time-invariant (LTI) filter from its impulse response provides insight into the role of linear algebra in the analysis of both dynamical systems and rational functions. For an LTI filter, a sequence of output data measured over some finite period of time may be expressed as the linear combination of the past input and the input measured over that same period. For a finite-dimensional LTI filter, the mapping from past input to future output is a finite-rank linear operator, and the effect of past input, that is, the memory of the system, may be represented as a finite-dimensional vector. This vector is the state of the system.

The central idea of realization theory is to first identify the mapping from past input to future output and to then factor it into two parts: a map from the input to the state and another from the state to the output. This factorization guarantees that the resulting system representation is both casual and finite-dimensional; thus it can be physically constructed, or realized.

System identification is the science of constructing dynamic models from experimentally measured data. Realization-based identification methods construct models by estimating the mapping from past input to future output based on this measured data. The non-deterministic nature of the estimation process causes this mapping to have an arbitrarily large rank, and so a rank-reduction step is required to factor the mapping into a suitable state-space model. Both these steps must be carefully considered to guarantee unbiased estimates of dynamic systems.

The foundations of realization theory are primarily due to Kalman and first appear in the landmark paper of [1], though the problem is not defined explicitly until [2], which also coins the term “realization” as being the a state-space model of a linear system constructed from an experimentally measured impulse response. It was [3] that introduced the structured-matrix approach now synonymous with the term “realization theory” by re-interpreting a theorem originally due to [4] in a state-space LTI system framework.

Although Kalman's original definition of “realization” implied an identification problem, it was not until [5] proposed rank-reduction by means of the singular-value decomposition that Ho's method became feasible for use with non-deterministic data sets. The combination of Ho's method and the singular-value decomposition was finally generalized to use with experimentally measured data by Kung in [6].

With the arrival of Kung's method came the birth of what is now known as the field of subspace identification methods. These methods use structured matrices of arbitrary input and output data to estimate a state-sequence from the system. The system is then identified from the propagation of the state over time. While many subspace methods exist, the most popular are the Multivariable Output-Error State Space (MOESP) family, due to [7], and the Numerical Algorithms for Subspace State-Space System Identification (N4SID) family, due to [8]. Related to subspace methods is the Eigensystem Realization Algorithm [9], which applies Kung's algorithm to impulse-response estimates, which are typically estimated through an Observer/Kalman Filter Identification (OKID) algorithm [10].

This chapter presents the central theory behind realization-based system identification in a chronological context, beginning with Kronecker's theorem, proceeding through the work of Kalman and Kung, and presenting a generalization of the procedure to arbitrary sets of data. This journey provides an interesting perspective on the original role of linear algebra in the analysis of rational functions and highlights the similarities of the different representations of LTI filters. Realization theory is a diverse field that connects many tools of linear algebra, including structured matrices, the QR-decomposition, the singular-value decomposition, and linear least-squares problems.

2. Transfer-function representations

We begin by reviewing some properties of discrete-time linear filters, focusing on the role of infinite series expansions in analyzing the properties of rational functions. The reconstruction of a transfer function from an infinite impulse response is equivalent to the reconstruction of a rational function from its Laurent series expansion. The reconstruction problem is introduced and solved by forming structured matrices of impulse-response coefficients.

2.1. Difference equations and transfer functions

Discrete-time linear filters are most frequently encountered in the form of difference equations that relate an input signal ukto an output signal yk. A simple example is an output ykdetermined by a weighted sum of the inputs from ukto uk-m,

yk=bmuk+bm-1uk-1++b0uk-m.uid2

More commonly, the output ykalso contains a weighted sum of previous outputs, such as a weighted sum of samples from yk-1to yk-n,

yk=bmuk+bm-1uk-1++b0uk-m-an-1yk-1-an-2yk-2-+a0yk-n.uid3

The impulse response of a filter is the output sequence gk=ykgenerated from an input

uk=1k=0,0k0.uid4

The parameters gkare the impulse-response coefficients, and they completely describe the behavior of an LTI filter through the convolution

yk=j=0gjuk-j.uid5

Filters of type () are called finite-impulse response (FIR) filters because gkis a finite-length sequence that settles to 0 once k>m. Filters of type () are called infinite impulse response (IIR) filters since generally the impulse response will never completely settle to 0.

A system is stable if a bounded ukresults in a bounded yk. Because the output of LTI filters is a linear combination of the input and previous output, any input-output sequence can be formed from a linear superposition of other input-output sequences. Hence proving that the system has a bounded output for a single input sequence is necessary and sufficient to prove the stability of an LTI filter. The simplest input to consider is an impulse, and so a suitable definition of system stability is that the absolute sum of the impulse response is bounded,

k=0gk<.uid6

Though the impulse response completely describes the behavior of an LTI filter, it does so with an infinite number of parameters. For this reason, discrete-time LTI filters are often written as transfer functions of a complex variable z. This enables analysis of filter stability and computation of the filter's frequency response in a finite number of calculations, and it simplifies convolution operations into basic polynomial algebra.

The transfer function is found by grouping output and input terms together and taking the Z-transform of both signals. Let Y(z)=k=-ykz-kbe the Z-transform of ykand U(z)be the Z-transform of uk. From the property

𝒵yk-1=Y(z)z-1

the relationship between Y(z)and U(z)may be expressed in polynomials of zas

a(z)Y(z)=b(z)U(z).

The ratio of these two polynomials is the filter's transfer function

G(z)=b(z)a(z)=bmzm+bm-1zm-1++b1z+b0zn+an-1zn-1++a1z+a0.uid7

When nm, G(z)is proper. If the transfer function is not proper, then the difference equations will have ykdependent on future input samples such as uk+1. Proper transfer functions are required for causality, and thus all physical systems have proper transfer function representations. When n>m, the system is strictly proper. Filters with strictly proper transfer functions have no feed-through terms; the output ykdoes not depend on uk, only the preceding input uk-1,uk-2,. In this chapter, we assume all systems are causal and all transfer functions proper.

If a(z)and b(z)have no common roots, then the rational function G(z)is coprime, and the order nof G(z)cannot be reduced. Fractional representations are not limited to single-input-single-output systems. For vector-valued input signals uknuand output signals ykny, an LTI filter may be represented as an ny×numatrix of rational functions Gij(z), and the system will have matrix-valued impulse-response coefficients. For simplicity, we will assume that transfer function representations are single-input-single-output, though all results presented here generalize to the multi-input-multi-output case.

2.2. Stability of transfer function representations

Because the effect of b(z)is equivalent to a finite-impulse response filter, the only requirement for b(z)to produce a stable system is that its coefficients be bounded, which we may safely assume is always the case. Thus the stability of a transfer function G(z)is determined entirely by a(z), or more precisely, the roots of a(z). To see this, suppose a(z)is factored into its roots, which are the poles piof G(z),

G(z)=b(z)i=1n(z-pi).uid9

To guarantee a bounded yk, it is sufficient to study a single pole, which we will denote simply as p. Thus we wish to determine necessary and sufficient conditions for stability of the system

G'(z)=1z-p.uid10

Note that pmay be complex. Assume that |z|>|p|. G'(z)then has the Laurent-series expansion

G'(z)=z-111-pz-1=z-1k=0pkz-k=k=1pk-1z-k.uid11

From the time-shift property of the z-transform, it is immediately clear that the sequence

gk'=0k=1,pk-1k>1,uid12

is the impulse response of G'(z). If we require that () is absolutely summable and let |z|=1, the result is the original stability requirement (), which may be written in terms of pas

k=1pk-1<.

This is true if and only if |p|<1, and thus G'(z)is stable if and only if |p|<1. Finally, from () we may deduce that a system is stable if and only if all the poles of G(z)satisfy the property |pi|<1.

2.3. Construction of transfer functions from impulse responses

Transfer functions are a convenient way of representing complex system dynamics in a finite number of parameters, but the coefficients of a(z)and b(z)cannot be measured directly. The impulse response of a system can be found experimentally by either direct measurement or from other means such as taking the inverse Fourier transform of a measured frequency response [11]. It cannot, however, be represented in a finite number of parameters. Thus the conversion between transfer functions and impulse responses is an extremely useful tool.

For a single-pole system such as (), the expansion () provides an obvious means of reconstructing a transfer function from a measured impulse response: given any 2 sequential impulse-response coefficients gkand gk+1, the pole of G'(z)may be found from

p=gk-1gk+1.uid14

Notice that this is true for any k, and the impulse response can be said to have a shift-invariant property in this respect.

Less clear is the case when an impulse response is generated by a system with higher-order a(z)and b(z). In fact, there is no guarantee that an arbitrary impulse response is the result of a linear system of difference equations at all. For an LTI filter, however, the coefficients of the impulse response exhibit a linear dependence which may be used to not only verify the linearity of the system, but to construct a transfer function representation as well. The exact nature of this linear dependence may be found by forming a structured matrix of impulse response coefficients and examining its behavior when the indices of the coefficients are shifted forward by a single increment, similar to the single-pole case in (). The result is stated in the following theorem, originally due to Kronecker [4] and adopted from the English translation of [12].

Theorem 1 (Kronecker's Theorem) Suppose G(z):is an infinite series of descending powers of z, starting with z-1,

G(z)=g1z-1+g2z-2+g3z-3+=k=1gkz-k.uid16

Assume G(z) is analytic (the series converges) for all |z|>1. Let Hbe an infinitely large matrix of the form

H=g1g2g3g2g3g4g3g4g5uid17

Then Hhas finite rank nif and only if G(z)is a strictly proper, coprime, rational function of degree nwith poles inside the unit circle. That is, G(z)has an alternative representation

G(z)=b(z)a(z)=bmzm+bm-1zm-1++b1z+b0zn+an-1zn-1++a1z+a0,uid18

in which m<n, all roots of a(z)satisfy |z|<1, a(z)and b(z)have no common roots, and we have assumed without loss of generality that a(z)is monic.

To prove Theorem , we first prove that for k>n, gkmust be linearly dependent on the previous nterms of the series for Hto have finite rank.

Theorem 2 The infinitely large matrix His of finite rank nif and only if there exists a finite sequence α1,α2,,αnsuch that for kn,

gk+1=j=1nαjgk-j+1,uid20

and nis the smallest number with this property.

Let hkbe the row of Hbeginning with gk. If Hhas rank n, then the first n+1rows of Hare linearly dependent. This implies that for some 1pn, hp+1is a linear combination of h1,,hp, and thus there exists some sequence αksuch that

hp+1=j=1pαjhp-j+1.uid21

The structure and infinite size of Himply that such a relationship must hold for all following rows of H, so that for q0

hq+p+1=j=1pαjhq+p-j+1.

Hence any row hk, k>p, can be expressed as a linear combination of the previous prows. Since Hhas at least nlinearly independent rows, p=n, and since this applies element-wise, error(H)=nimplies ().

Alternatively, () implies a relationship of the form () exists, and hence error(H)=p. Since nis the smallest possible p, this implies error(H)=n.

We now prove Theorem .Suppose G(z)is a coprime rational function of the form () with series expansion (), which we know exists, since G(z)is analytic for |z|<1. Without loss of generality, let m=n-1, since we may always let bk=0for some k. Hence

bn-1zn-1+bn-2zn-2++b1z+b0zn+an-1zn-1++a1z+a0=g1z-1+g2z-2+g3z-3+

Multiplying both sides by the denominator of the left,

bn-1zn-1+bn-2zn-2++b1z+b0=g1zn-1+(g2+g1an-1)zn-2+(g3+g2an-1+g1an-2)zn-3+,

and equating powers of z, we find

bn-1=g1bn-2=g2+g1an-1bn-3=g3+g2an-1+g1an-2=b1=gn-1+gn-2an-1++g1a2b0=gn+gn-1an-1++g1a10=gk+1+gkan-1++gk-n+1a0kn.uid22

From this, we have, for kn,

gk+1=j=1n-ajgk-j+1,

which not only shows that () holds, but also shows that αj=-aj. Hence by Theorem , Hmust have finite rank.

Conversely, suppose Hhas finite rank. Then () holds, and we may construct a(z)from αkand b(z)from () to create a rational function. This function must be coprime since its order nis the smallest possible.

The construction in Theorem is simple to extend to the case in which G(z)is only proper and not strictly proper; the additional coefficient bnis simply the feed-through term in the impulse response, that is, g0.

A result of Theorem is that given finite-dimensional, full-rank matrices

Hk=gkgk+1gk+n-1gk+1gk+2gk+ngk+n-1gk+ngk+2n-2uid23

and

Hk+1=gk+1gk+2gk+ngk+2gk+3gk+n+1gk+ngk+n+1gk+2n-1,uid24

the coefficients of a(z)may be calculated as

000-a0100-a1010-a2001-an-1=Hk-1Hk+1.uid25

Notice that () is in fact a special case of (). Thus we need only know the first 2n+1impulse-response coefficients to reconstruct the transfer function G(z): 2nto form the matrices Hkand Hk+1from () and (), respectively, and possibly the initial coefficient g0in case of an nth-order b(z).

Matrices with the structure of Hare useful enough to have a special name. A Hankel matrix His a matrix constructed from a sequence {hk}so that each element H(j,k)=hj+k. For the Hankel matrix in (), hk=gk-1. Hkalso has an interesting property implied by (): its row space is invariant under shifting of the index k. Because its symmetric, this is also true for its column space. Thus this matrix is also often referred to as being shift-invariant.

While () provides a potential method of identifying a system from a measured impulse response, this is not a reliable method to use with measured impulse response coefficients that are corrupted by noise. The exact linear dependence of the coefficients will not be identical for all k, and the structure of () will not be preserved. Inverting Hkwill also invert any noise on gk, potentially amplifying high-frequency noise content. Finally, the system order nis required to be known beforehand, which is usually not the case if only an impulse response is available. Fortunately, these difficulties may all be overcome by reinterpreting the results Kronecker's theorem in a state-space framework. First, however, we more carefully examine the role of the Hankel matrix in the behavior of LTI filters.

2.4. Hankel and Toeplitz operators

The Hankel matrix of impulse response coefficients () is more than a tool for computing the transfer function representation of a system from its impulse response. It also defines the mapping of past input signals to future output signals. To define exactly what this means, we write the convolution of () around sample k=0in matrix form as

y-3y-2y-1y0y1y2=0g0g1g0g2g1g0g3g2g1g0g4g3g2g1g0g5g4g3g2g1g0u-3u-2u-1u0u1u2,

where the vectors and matrix have been partitioned into sections for k<0and k0. The output for k0may then be split into two parts:

y0y1y2yf=g1g2g3g2g3g4g3g4g5Hu-1u-2u-3up+g00g1g0g2g1g0Tu0u1u2uf,uid27

where the subscripts pand fdenote “past” and “future,” respectively. The system Hankel matrix Hhas returned to describe the effects of the past input upon the future output yf. Also present is the matrix T, which represents the convolution of future input ufwith the impulse response. Matrices such as Twith constant diagonals are called Toeplitz matrices.

From (), it can be seen that Hdefines the effects of past input on future output. One interpretation of this is that Hrepresents the “memory” of the system. Because His a linear mapping from upto yf, the induced matrix 2-norm of H, H2, can be considered a function norm, and in a sense, H2is a measure of the “gain” of the system. H2is often called the Hankel-norm of a system, and it plays an important role in model reduction and in the analysis of anti-causal systems. More information on this aspect of linear systems can be found in the literature of robust control, for instance, [13].

3. State-space representations

Although transfer functions define system behavior completely with a finite number of parameters and simplify frequency-response calculations, they are cumbersome to manipulate when the input or output is multi-dimensional or when initial conditions must be considered. The other common representation of LTI filters is the state-space form

xk+1=Axk+Bukyk=Cxk+Duk,uid28

in which xknis the system state. The matrices An×n, Bn×nu, Cny×n, and Dny×nucompletely parameterize the system. Only Duniquely defines the input-output behavior; any nonsingular matrix T'may be used to change the state basis via the relationships

x'=T'xA'=T'AT'-1B'=T'BC'=CT'-1.

The Z-transform may also be applied to the state-space equations () to findZ[xk+1]= AZ[xk] + BZ[uk]X(z) z= A X(z) + B U(z)

Z[yk]= CZ[xk] + DZ[uk]Y(z)= C X(z) + D U(z)

Y(z)U(z)=G(z)G(z)=CzI-A-1B+D,uid29

and thus, if () is the state-space representation of the single-variable system (), then a(z)is the characteristic polynomial of A, det(zI-A).

Besides clarifying the effect of initial conditions on the output, state-space representations are inherently causal, and () will always result in a proper system (strictly proper if D=0). For this reason, state-space representations are often called realizable descriptions; while the forward-time-shift of zis an inherently non-causal operation, state-space systems may always be constructed in reality.

3.1. Stability, controllability, and observability of state-space representations

The system impulse response is simple to formulate in terms of the state-space parameters by calculation of the output to a unit impulse with x0=0:

gk=Dk=0,CAk-1Bk>0.uid31

Notice the similarity of () and (). In fact, from the eigenvalue decomposition of A,

A=VΛV-1,

we find

k=1gk=k=1CAk-1B=k=1CV(|Λk-1|)V-1B.

The term |Λk-1|will only converge if the largest eigenvalue of Ais within the unit circle, and thus the condition that all eigenvalues λiof Asatisfy |λi|<1is a necessary and sufficient condition for stability.

For state-space representations, there is the possibility that a combination of Aand Bwill result in a system for which xkcannot be entirely controlled by the input uk. Expressing xkin a matrix-form similar to () as

xk=𝒞uk-1uk-2uk-3,𝒞=BABA2Buid32

demonstrates that xkis in subspace nif and only if 𝒞has rank n. 𝒞is the controllability matrix and the system is controllable if it has full row rank.

Similarly, the state xkmay not uniquely determine the output for some combinations of Aand C. Expressing the evolution of the output as a function of the state in matrix-form as

ykyk+1yk+2=𝒪xk,𝒪=CCACA2

demonstrates that there is no nontrivial null space in the mapping from xkto ykif and only if 𝒪has rank n. 𝒪is the observability matrix and the system is observable if it has full column rank.

Systems that are both controllable and observable are called minimal, and for minimal systems, the dimension nof the state variable cannot be reduced. In the next section we show that minimal state-space system representations convert to coprime transfer functions that are found through ().

3.2. Construction of state-space representations from impulse responses

The fact that the denominator of G(z)is the characteristic polynomial of Anot only allows for the calculation of a transfer function from a state-space representation, but provides an alternative version of Kronecker's theorem for state-space systems, known as the Ho-Kalman Algorithm [3]. From the Caley-Hamilton theorem, if a(z)is the characteristic polynomial of A, then a(A)=0, and

CAka(A)B=CAkAn+an-1An-1++a1A+a0B=CAk+nB+j=0n-1ajCAk+jB,

which implies

CAk+nB=-j=0n-1ajCAk+jB.uid34

Indeed, substitution of () into () and rearrangement of the indices leads to (). Additionally, substitution of () into the product of 𝒪and 𝒞shows that

𝒪𝒞=CBCABCA2BCABCA2BCA3BCA2BCA3BCA4B=g1g2g3g2g3g4g3g4g5=H,

which confirms our previous statement that Heffectively represents the memory of the system. Because

error(H)=min{error(𝒪),error(𝒞)},

we see that error(H)=nimplies the state-space system () is minimal.

If the entries of Hare shifted forward by one index to form

H¯=g2g3g4g3g4g5g4g5g6,

then once again substituting () reveals

H¯=𝒪A𝒞.uid35

Thus the row space and column space of Hare invariant under a forward-shift of the indices, implying the same shift-invariant structure seen in ().

The appearance of Ain () hints at a method for constructing a state-space realization from an impulse response. Suppose the impulse response is known exactly, and let Hrbe a finite slice of Hwith rblock rows and Lcolumns,

Hr=g1g2g3gLg2g3g4gL+1g3g4g5gL+2gr-1grgr+1gr+L-1.

Then any appropriately dimensioned factorization

Hr=𝒪r𝒞L=CCACA2CAr-1BABA2BAL-1Buid36

may be used to find Afor some arbitrary state basis as

A=𝒪rH¯r𝒞Luid37

where H¯ris Hrwith the indices shifted forward once and (·)is the Moore-Penrose pseudoinverse. Ctaken from the first block row of 𝒪r, Btaken from the first block column of 𝒞L, and Dtaken from g0then provides a complete and minimal state-space realization from an impulse response. Because Hrhas rank nand det(zI-A)has degree n, we know from Kronecker's theorem that G(z)taken from () will be coprime.

However, as mentioned before, the impulse response of the system is rarely known exactly. In this case only an estimate H^rwith a non-deterministic error term is available:

H^r=Hr+E.

Because Eis non-deterministic, H^will always have full rank, regardless of the number of rows r. Thus ncannot be determined from examining the rank of H, and even if nis known beforehand, a factorization () for r>nwill not exist. Thus we must find a way of reducing the rank of H^rin order to find a state-space realization.

3.3. Rank-reduction of the Hankel matrix estimate

If H^rhas full rank, or if nis unknown, its rank must be reduced prior to factorization. The obvious tool for reducing the rank of matrices is the singular-value decomposition(SVD). Assume for now that nis known. The SVD of H^ris

H^r=UΣVT

where Uand VTare orthogonal matrices and Σis a diagonal matrix containing the nonnegative singular valuesσiordered from largest to smallest. The SVD for a matrix is unique and guaranteed to exist, and the number of nonzero singular values of a matrix is equal to its rank [14].

Because Uand VTare orthogonal, the SVD satisfies

H^r=UΣVT2=Σ2=σ1uid41

where ·2is the induced matrix 2-norm, and

H^r=UΣVTF=ΣF=ilσi21/2uid42

where ·Fis the Frobenius norm. Equation () also shows that the Hankel norm of a system is the maximum singular value of Hr. From () and (), we can directly see that if the SVD of Hris partitioned into

H^r=UnUsΣn00ΣsVnTVsT,

where Unis the first ncolumns of U, Σnis the upper-left n×nblock of Σ, and VnTis the first nrows of VT, the solution to the rank-reduction problem is [14]

Q=arg~minerror(Q)=nQ-H^r2=arg~minerror(Q)=nQ-H^rF=UnΣnVnT.

Additionally, the error resulting from the rank reduction is

e=Q-H^r2=σn+1,

which suggests that if the rank of Hris not known beforehand, it can be determined by examining the nonzero singular values in the deterministic case or by searching for a significant drop-off in singular values if only a noise-corrupted estimate is available.

3.4. Identifying the state-space realization

From a rank-reduced H^r, any factorization

H^r=𝒪^r𝒞^L

can be used to estimate 𝒪rand 𝒞L. The error in the state-space realization, however, will depend on the chosen state basis. Generally we would like to have a state variable with a norm xk2in between uk2and yk2. As first proposed in [5], choosing the factorization

𝒪^r=UnΣn1/2and𝒞^L=Σn1/2VnTuid44

results in

𝒪^r2=𝒞^L2=H^r2,uid45

and thus, from a functional perspective, the mappings from input to state and state to output will have equal magnitudes, and each entry of the state vector xkwill have similar magnitudes. State-space realizations that satisfy () are sometimes called internally balanced realizations [11]. (Alternative definitions of a “balanced” realization exist, however, and it is generally wise to verify the definition in each context.)

Choosing the factorization () also simplifies computation of the estimate A^, since

A^=𝒪^rH¯^r𝒞^L=Σn-1/2UnTH¯^rVnΣn-1/2.

By estimating B^as the first block column of 𝒞^L, C^as the first block row of 𝒪^L, and D^as g0, a complete state-space realization (A^,B^,C^,D^)is identified from this method.

3.5. Pitfalls of direct realization from an impulse response

Even though the rank-reduction process allows for realization from a noise-corrupted estimate of an impulse response, identification methods that generate a system estimate from a Hankel matrix constructed from an estimated impulse response have numerous difficulties when applied to noisy measurements. Measuring an impulse response directly is often infeasible; high-frequency damping may result in a measurement that has a very brief response before the signal-to-noise ratio becomes prohibitively small, and a unit pulse will often excite high-frequency nonlinearities that degrade the quality of the resulting estimate.

Taking the inverse Fourier transform of the frequency response guarantees that the estimates of the Markov parameters will converge as the dataset grows only so long as the input is broadband. Generally input signals decay in magnitude at higher frequencies, and calculation of the frequency response by inversion of the input will amplify high-frequency noise. We would prefer an identification method that is guaranteed to provide a system estimate that converges to the true system as the amount of data measured increases and that avoids inverting the input. Fortunately, the relationship between input and output data in () may be used to formulate just such an identification procedure.

4. Realization from input-output data

To avoid the difficulties in constructing a system realization from an estimated impulse response, we will form a realization-based identification procedure applicable to measured input-output data. To sufficiently account for non-deterministic effects in measured data, we add a noise term vknyto the output to form the noise-perturbed state-space equations

xk+1=Axk+Bukyk=Cxk+Duk+vk.uid47

We assume that the noise signal vkis generated by a stationary stochastic process, which may be either white or colored. This includes the case in which the state is disturbed by process noise, so that the noise process may have the same poles as the deterministic system. (See [15] for a thorough discussion of representations of noise in the identification context.)

4.1. Data-matrix equations

The goal is to construct a state-space realization using the relationships in (), but doing so requires a complete characterization of the row space of Hr. To this end, we expand a finite-slice of the future output vector to form a block-Hankel matrix of output data with rblock rows,

Y=y0y1y2yLy1y2y3yL+1y2y3y4yL+2yr-1yryr+1yr+L-1.

This matrix is related to a block-Hankel matrix of future input data

Uf=u0u1u2uLu1u2u3uL+1u2u3u4uL+2ur-1urur+1ur+L-1,

a block-Toeplitz matrix of past input data

Up=u-1u0u1uL-1u-2u-1u0uL-2u-3u-2u-1uL-3,

a finite-dimensional block-Toeplitz matrix

T=g00g1g0g2g1g0gr-1gr-2gr-3g0,

the system Hankel matrix H, and a block-Hankel matrix Vformed from noise data vkwith the same indices as Yby the equation

Y=HUp+TUf+V.uid49

If the entries of Yfare shifted forward by one index to form

Y¯=y1y2y3yL+1y2y3y4yL+2y3y4y5yL+3yryr+1yr+2yr+L,

then Y¯fis related to the shifted system Hankel matrix H¯, the past input data Up, Twith a block column appended to the left, and Ufwith a block row appended to the bottom,

T¯=g1g2g3Tgr,U¯f=Ufurur+1ur+2ur+L,

and a block-Hankel matrix V¯of noise data vkwith the same indices as Y¯by the equation

Y¯=H¯Up+T¯U¯f+V¯.uid50

From (), the state is equal to the column vectors of Upmultiplied by the entries of the controllability matrix 𝒞, which we may represent as the block-matrix

X=x0x1x2xL=𝒞Up,

which is an alternative means of representing the memory of the system at sample 0, 1, .... The two data matrix equations () and () may then be written asY=Or X + T Uf + V ,

Y=Or A X +T Uf +V. Equation () is basis for the field of system identification methods known as subspace methods. Subspace identification methods typically fall into one of two categories. First, because a shifted observability matrix

𝒪¯=CACA2CA3,

satisfies

im(𝒪)=im(𝒪¯),

where im(·)of denotes the row space (often called the “image”), the row-space of 𝒪is shift-invariant, and Amay be identified from estimates 𝒪rand 𝒪¯ras

A^=𝒪^r𝒪¯^r.

Alternatively, because a forward-propagated sequence of states

X¯=AX

satisfies

im(XT)=im(X¯T),

the column-space of Xis shift-invariant, and Amay be identified from estimates X^and X¯^as

A^=X¯^X^.

In both instances, the system dynamics are estimated by propagating the indices forward by one step and examining a propagation of linear dynamics, not unlike () from Kronecker's theorem. Details of these methods may be found in [16] and [17]. In the next section we present a system identification method that constructs system estimates from the shift-invariant structure of Yitself.

4.2. Identification from shift-invariance of output measurements

Equations () and () still contain the effects of the future input in U¯f. To remove these effects from the output, we must first add some assumptions about U¯f. First, we assume that U¯fhas full row rank. This is true for any Ufwith a smooth frequency response or if U¯fis generated from some pseudo-random sequence. Next, we assume that the initial conditions in Xdo not somehow cancel out the effects of future input. A sufficient condition for this is to require

errorXU¯f=n+rnu

to have full row rank. Although these assumptions might appear restrictive at first, since it is impossible to verify without knowledge of X, it is generally true with the exception of some pathological cases.

Next, we form the null-space projector matrix

Π=IL+1-U¯fTU¯fU¯fT-1U¯f,uid52

which has the property

U¯fΠ=0.

We know the inverse of (U¯fU¯fT)exists, since we assume U¯fhas full row rank. Projector matrices such as () have many interesting properties. Their eigenvalues are all 0 or 1, and if they are symmetric, they separate the subspace of real vectors — in this case, vectors in L+1— into a subspace and its orthogonal complement. In fact, it is simple to verify that the null space of U¯fcontains the null space of Ufas a subspace, since

U¯fΠ=UfΠ=0.

Thus multiplication of () and () on the right by Πresults in

Y=Or X+ V,

Y=Or A X+V. It is also unnecessary to compute the projected products YΠand Y¯Πdirectly, since from the QR-decomposition

U¯TYT=Q1Q2R11R120R22,

we have

Y=R12TQ1T+R22TQ2Tuid53

and U=R11TQ1T. Substitution into () reveals

Π=I-Q1Q1T.uid54

Because the columns of Q1and Q2are orthogonal, multiplication of () on the right by () results in

YΠ=R22TQ2T.

A similar result holds for Y¯Π. Taking the QR-decomposition of the data can alternatively be thought of as using the principle of superposition to construct new sequences of input-output data through a Gram-Schmidt-type orthogonalization process. A detailed discussion of this interpretation can be found in [18].

Thus we have successfully removed the effects of future input on the output while retaining the effects of the past, which is the foundation of the realization process. We still must account for non-deterministic effects in Vand V¯. To do so, we look for some matrix Zsuch that

VZT0 ,

VZT0 . This requires the content of Zto be statistically independent of the process that generates vk. The input ukis just such a signal, so long as the filter output is not a function of the input — that is, the data was measured in open-loop operation,. If we begin measuring input before k=0at some sample k=-ζand construct Zas a block-Hankel matrix of past input data,

Z=1Lu-ζu-ζ+1u-ζ+2u-ζ+Lu-ζ+1u-ζ+2u-ζ+3u-ζ+L+1u-ζ+2u-ζ+3u-ζ+4u-ζ+L+2u-1u0u1ur+L-2,

then multiplication of () and () on the right by ZTresults in

YZTOr XZT ,

YZTOr AZT , as L. Note the term 1Lin Zis necessary to keep () and () bounded.

Finally we are able to perform our rank-reduction technique directly on measured data without needing to first estimate the impulse response. From the SVD

YΠZT=UΣVT,

we may estimate the order nby looking for a sudden decrease in singular values. From the partitioning

YΠZT=UnUsΣn00ΣsVnTVsT,

we may estimate 𝒪rand XΠZTfrom the factorization

𝒪^r=UnΣn1/2andX^ΠZT=Σn1/2VnT.

Amay then be estimated as

A^=𝒪^rY¯ΠZTX^ΠZT=Σn-1/2UnTY¯ΠZTVnΣn-1/2𝒪rH¯UpΠ𝒞LUpΠ𝒪rH¯𝒞L.

And so we have returned to our original relationship ().

While Cmay be estimated from the top block row of 𝒪^r, our projection has lost the column space of Hrthat we previously used to estimate B, and initial conditions in Xprevent us from estimating Ddirectly. Fortunately, if Aand Care known, then the remaining terms B, D, and an initial condition x0are linear in the input output data, and may be estimated by solving a linear-least-squares problem.

4.3. Estimation of B, D, and x0

The input-to-state terms Band Dmay be estimated by examining the convolution with the state-space form of the impulse response. Expanding () with the input and including an initial condition x0results in

yk=CAkx0+j=0k-1CAk-j-1Buj+Duk+vk.uid56

Factoring out Band Don the right provides

yk=CAkx0+j=0k-1ukTCAk-j-1vec(B)+ukTInyvec(D)+vk,

in which vec(·)is the operation that stacks the columns of a matrix on one another, is the (coincidentally named) Kronecker product, and we have made use of the identity

vec(AXB)=(BTA)vec(X).

Grouping the unknown terms together results in

yk=CAkj=0k-1ukTCAk-j-1ukTInyx0vec(B)vec(D)+vk.

Thus by forming the regressor

φk=C^A^kj=0k-1ukTC^A^k-j-1ukTIny

from the estimates A^and C^, estimates of Band Dmay be found from the least-squares solution of the linear system of Nequations

y0y1y2yN=φ0φ1φ2φNx^0vec(B^)vec(D^).

Note that Nis arbitrary and does not need to be related in any way to the indices of the data matrix equations. This can be useful, since for large-dimensional systems, the regressor φkmay become very computationally expensive to compute.

5. Conclusion

Beginning with the construction of a transfer function from an impulse response, we have constructed a method for identification of state-space realizations of linear filters from measured input-output data, introducing the fundamental concepts of realization theory of linear systems along the way. Computing a state-space realization from measured input-output data requires many tools of linear algebra: projections and the QR-decomposition, rank reduction and the singular-value decomposition, and linear least squares. The principles of realization theory provide insight into the different representations of linear systems, as well as the role of rational functions and series expansions in linear algebra.

© 2012 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Daniel N. Miller and Raymond A. de Callafon (July 11th 2012). Identification of Linear, Discrete-Time Filters via Realization, Linear Algebra - Theorems and Applications, Hassan Abid Yasser, IntechOpen, DOI: 10.5772/48293. Available from:

chapter statistics

1866total chapter downloads

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

Related Content

This Book

Next chapter

Partition-Matrix Theory Applied to the Computation of Generalized-Inverses for MIMO Systems in Rayleigh Fading Channels

By P. Cervantes, L.F. González, F.J. Ortiz and A.D. García

Related Book

First chapter

Cramer’s Rules for the System of Two-Sided Matrix Equations and of Its Special Cases

By Ivan I. Kyrchei

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More About Us