Eigenvalue Problems

Aleksandra Kostić

doi:10.5772/62267

Abstract

Differential equations and systems of differential equations are used widely in the natural sciences and engineering. Solving them often leads to an eigenvalue problem, which is why eigenvalue problems occupy an important place in linear algebra. In this chapter we consider linear and quadratic eigenvalue problems. For the linear eigenvalue problem we emphasize the QR algorithm for the unsymmetric case and the minmax characterization for the symmetric case. For the quadratic eigenvalue problem we consider linearization and variational characterization. We illustrate everything with practical examples.

Keywords

QR algorithm
min max principle
Rayleigh functional
linearization
eigenvalues

Author Information

Aleksandra Kostić*
- Faculty of Mechanical Engineering, University of Sarajevo, Sarajevo, Bosnia and Herzegovina

*Address all correspondence to: kostic@mef.unsa.ba

1. Introduction

Every mechanical system has the property of vibration. An analogous phenomenon can be observed in electrical systems in the form of oscillating circuits. Neglecting vibration can lead to resonance, which on the one hand can have catastrophic consequences, such as the collapse of a bridge, but on the other hand can be put to positive use. The mathematical description of a vibratory state leads to a differential equation or a system of differential equations, which is then transformed into an eigenvalue problem. This is the motivation for considering eigenvalue problems.

This chapter is organized in two parts: Section 2 treats the linear eigenvalue problem and Section 3 the quadratic eigenvalue problem.

2. The linear eigenvalue problem

This section considers the linear eigenvalue problem of finding a parameter λ such that the linear system

$$Ax = \lambda x \tag{1}$$

has a nontrivial solution x, where A ∈ ℂ^(n,n). The scalar λ is called an eigenvalue of A, and x is an eigenvector of A corresponding to λ. The set of all eigenvalues of the matrix A is called the spectrum of A and is denoted by σ(A).

The literature distinguishes right and left eigenvectors. Here we restrict ourselves to right eigenvectors, as defined above.

This section is organized as follows:

Basic properties (characteristic polynomial, bases for eigenspaces, eigenvalues and invertibility, diagonalization)
QR algorithm (The QR algorithm is used for determining all the eigenvalues of a matrix. Today, it is the best method for solving unsymmetric eigenvalue problems.)
Mathematical background for the Hermitian (symmetric) case (Rayleigh quotient, minmax principle of Poincaré, minmax principle of Courant-Fischer)
Physical background
General linear eigenvalue problem

$$Ax = \lambda Bx \tag{2}$$

2.1. Basic properties

In this section we outline the basic concepts and theorems, which will allow us to understand further elaboration. The eigenvalue problem is related to the homogeneous system of linear equations, as we will see in the following discussion.

To find the eigenvalues of an n × n matrix A, we insert the identity matrix I and rewrite (1) as

$$Ax = \lambda I x \tag{3}$$

or, equivalently,

$$(A - \lambda I)x = 0. \tag{4}$$

This system is homogeneous, and it has a nontrivial solution if and only if

$$\det(A - \lambda I) = 0. \tag{5}$$

This is called the characteristic equation of A; the scalars satisfying this equation are the eigenvalues of A. When expanded, the determinant det(A − λI) is a polynomial p in λ, called the characteristic polynomial of A.

The following theorem gives the link between the characteristic polynomial of the matrix A and its eigenvalues.

Theorem 2.1.1. Equivalent Statements

If A is an n × n matrix and λ is a complex number, then the following are equivalent:

(a) λ is an eigenvalue of A.
(b) The system of equations (A − λI)x = 0 has nontrivial solutions.
(c) There is a nonzero vector x in ℂⁿ such that Ax = λx.
(d) λ is a solution of the characteristic equation det(A − λI) = 0.

Some coefficients of the characteristic polynomial of A have a specific shape. The following theorem gives the information about it.

Theorem 2.1.2.

If A is an n × n matrix, then the characteristic polynomial p(λ) of A has degree n, the coefficient of λⁿ is (−1)ⁿ, the coefficient of λ^(n−1) is (−1)^(n−1) trace(A), and the constant term is det(A), where trace(A) := a₁₁ + a₂₂ + ⋯ + a_nn.
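The statement of Theorem 2.1.2 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the 3 × 3 test matrix is an arbitrary illustration and not taken from the text. Note that `np.poly` returns the coefficients of det(λI − A), so they are multiplied by (−1)ⁿ to match the convention p(λ) = det(A − λI) used here.

```python
import numpy as np

# Arbitrary illustrative matrix (an assumption, not from the chapter)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
n = A.shape[0]

# np.poly(A) gives the coefficients of det(lambda*I - A), highest power first.
# Multiplying by (-1)^n converts them to p(lambda) = det(A - lambda*I).
p = (-1) ** n * np.poly(A)

lead_coeff = p[0]    # coefficient of lambda^n
next_coeff = p[1]    # coefficient of lambda^(n-1)
const_term = p[-1]   # constant term p(0)

print(lead_coeff, (-1) ** n)
print(next_coeff, (-1) ** (n - 1) * np.trace(A))
print(const_term, np.linalg.det(A))
```

Each printed pair should agree up to rounding errors.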

For some structured matrices the eigenvalues can be read off directly, as shown in Theorem 2.1.3.

Theorem 2.1.3.

If A is an n × n triangular matrix (upper triangular, lower triangular, or diagonal), then the eigenvalues of A are the entries on the main diagonal of A.

The Cayley-Hamilton theorem is one of the most important statements in linear algebra. The theorem states:

Theorem 2.1.4.

Substituting the matrix A for λ in the characteristic polynomial p of A yields the zero matrix, i.e., p(A) = 0.
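As a small illustration of Theorem 2.1.4, the sketch below evaluates p(A) with Horner's scheme for an arbitrarily chosen 2 × 2 matrix (an assumption for illustration). For this matrix p(λ) = λ² − 5λ − 2, and A² − 5A − 2I is indeed the zero matrix.

```python
import numpy as np

# Arbitrary illustrative matrix (an assumption, not from the chapter)
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
n = A.shape[0]

# Coefficients of p(lambda) = det(A - lambda*I), highest power first
coeffs = (-1) ** n * np.poly(A)

# Horner's scheme with a matrix argument: P = (...(c0*A + c1*I)*A + ...) + cn*I
P = np.zeros_like(A)
for c in coeffs:
    P = P @ A + c * np.eye(n)

print(P)  # entries should vanish up to rounding errors
```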

There are a number of methods for determining eigenvalues. Some methods find all the eigenvalues, others only a few of them. Methods that first determine the coefficients of the characteristic polynomial and then compute the eigenvalues by solving the resulting algebraic equation are rarely implemented, because they are numerically unstable: the coefficients of the characteristic polynomial are contaminated by rounding errors, and these cause large errors in the computed eigenvalues. For this reason the characteristic polynomial has mainly theoretical significance. Methods based on direct use of the characteristic polynomial are applied in practice only when the characteristic polynomial is well conditioned. For some structured matrices we can also use the characteristic polynomial without computing its coefficients explicitly. The following example describes a class of such matrices.

Example 2.1.1 An example of structured matrices for which the characteristic polynomial can be used to determine the eigenvalues are Toeplitz matrices. Toeplitz matrices, denoted T_n, are matrices with constant diagonals. If the Toeplitz matrix is symmetric and positive definite, the recursive relation p_n(λ) = p_(n−1)(λ) β_(n−1)(λ) holds, where p_n and p_(n−1) are the characteristic polynomials of T_n and T_(n−1), respectively, and β_(n−1) is the Schur-Szegö parameter of the Yule-Walker system. This recursive relation makes it possible to work with the characteristic polynomial without computing its individual coefficients. More information can be found in [1].

The following definition introduces two important terms: the geometric multiplicity and the algebraic multiplicity of an eigenvalue λ₀.

The eigenvectors corresponding to λ are the nonzero vectors in the solution space of (A − λI)x = 0. We call this solution space the eigenspace of A corresponding to λ.

Definition 2.1.1.

If λ₀ is an eigenvalue of an n × n matrix A, then the dimension of the eigenspace corresponding to λ₀ is called the geometric multiplicity of λ₀, and the number of times that λ − λ₀ appears as a factor in the characteristic polynomial of A is called the algebraic multiplicity of λ₀.

Eigenvalues and eigenvectors have some basic properties that are easy to prove. They are collected in the following theorem.

Theorem 2.1.5.

(a) If μ ≠ 0 is a complex number, λ is an eigenvalue of the matrix A, and x ≠ 0 is a corresponding eigenvector, then μx is also an eigenvector corresponding to λ.
(b) If k is a positive integer, λ is an eigenvalue of the matrix A, and x ≠ 0 is a corresponding eigenvector, then λ^k is an eigenvalue of A^k and x is a corresponding eigenvector.
(c) The matrices A and A^T have the same eigenvalues.

Invertible matrices are important in linear algebra. From the eigenvalues we can easily conclude whether a matrix A is invertible. Moreover, the eigenvalues of the inverse A⁻¹ can be read off immediately from the eigenvalues of the invertible matrix A. The following theorem summarizes some properties of invertible matrices.

Theorem 2.1.6.

If A is an n × n matrix, then the following are equivalent.

(a) A is invertible.
(b) λ = 0 is not an eigenvalue of A.
(c) det(A) ≠ 0.
(d) A has rank n.
(e) Ax = 0 has only the trivial solution.
(f) Ax = b has exactly one solution for every n × 1 matrix b.
(g) The column vectors of A form a basis for ℝⁿ.
(h) The row vectors of A form a basis for ℝⁿ.
(i) A^TA is invertible.

Moreover, if A is invertible, λ is an eigenvalue of A, and x ≠ 0 is a corresponding eigenvector, then 1/λ is an eigenvalue of A⁻¹ and x is a corresponding eigenvector.

The problem of finding a basis of ℝⁿ consisting of eigenvectors is very important in linear algebra. For this reason, in this section we consider the following two equivalent problems:

The Eigenvector Problem. Given an n × n matrix A, does there exist a basis for ℝⁿ consisting of eigenvectors of A?

The Diagonalization Problem (Matrix Form). Given an n × n matrix A, does there exist an invertible matrix P such that P⁻¹AP is a diagonal matrix?

The latter problem suggests the following terminology.

Definition 2.1.3. Two square matrices A and B are called similar if there is an invertible matrix P such that B = P⁻¹AP. The transition from the matrix A to the matrix B is called a similarity transformation.

The importance of similar matrices can be seen in the following theorem:

Theorem 2.1.7. Similar matrices A and B have the same characteristic polynomial. They have the same eigenvalues, and each eigenvalue λ₀ has the same algebraic and the same geometric multiplicity for A and for B.

Based on the previous definitions, we can define the term diagonalizable as follows:

Definition 2.1.4. A square matrix A is called diagonalizable if it can be brought to diagonal form by a similarity transformation.

That the above two problems are equivalent follows from the next theorem.

Theorem 2.1.8.

If A is an n × n matrix, then the following are equivalent.

(a) A is diagonalizable.
(b) A has n linearly independent eigenvectors.

The following algorithm diagonalizes a matrix.

Algorithm for Diagonalizing a Matrix

Find n linearly independent eigenvectors of A, denoted x¹, x², ⋯, xⁿ.

Form the matrix P having x¹, x², ⋯, xⁿ as its column vectors.

The matrix P⁻¹AP will then be diagonal with λ₁, λ₂, ⋯, λ_n as its successive diagonal entries, where λ_i is the eigenvalue corresponding to xⁱ, for i = 1, 2, ⋯, n.
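The three steps above can be sketched in a few lines. This is only an illustration under assumptions: the 2 × 2 matrix is chosen arbitrarily, and the eigenvectors are obtained from NumPy's `np.linalg.eig` rather than by hand.

```python
import numpy as np

# Arbitrary illustrative matrix with eigenvalues 2 and 5 (an assumption)
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Step 1: eigenvalues and eigenvectors; the eigenvectors are the columns of P
eigvals, P = np.linalg.eig(A)

# Step 2: P is invertible exactly when its columns are linearly independent
assert np.linalg.matrix_rank(P) == A.shape[0]

# Step 3: P^{-1} A P is diagonal, with the eigenvalues on the diagonal
D = np.linalg.inv(P) @ A @ P

print(np.round(D, 12))
```

The order of the diagonal entries of D follows the order in which the eigenvectors appear as columns of P.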

Theorem 2.1.9.

If x¹, x², ⋯, x^k are eigenvectors of A corresponding to distinct eigenvalues λ₁, λ₂, ⋯, λ_k, then {x¹, x², ⋯, x^k } is a linearly independent set.

As a consequence of Theorem 2.1.8 and Theorem 2.1.9, we obtain the following important result.

Theorem 2.1.10.

If an n × n matrix A has n distinct eigenvalues, then A is diagonalizable.

There are matrices that have repeated eigenvalues and are nevertheless diagonalizable. An important class of such matrices are the normal matrices, which we introduce in the following definition.

Definition 2.1.5. A matrix A ∈ ℂ^(n,n) is called normal if A^HA = AA^H.

A more general characterization of diagonalizable matrices is given in the following theorem.

Theorem 2.1.11.

If A is a square matrix, then:

(a) For every eigenvalue of A the geometric multiplicity is less than or equal to the algebraic multiplicity.
(b) A is diagonalizable if and only if the geometric multiplicity is equal to the algebraic multiplicity for every eigenvalue.

2.2. QR algorithm

In this section we present the QR algorithm. In numerical linear algebra, the QR algorithm is an eigenvalue algorithm, that is, a procedure for calculating the eigenvalues and eigenvectors of a matrix. The QR algorithm is a factorization method, because it is based on a matrix decomposition. The first factorization method, the LR algorithm, was developed by H. Rutishauser in 1958. Because of its many shortcomings, the LR algorithm is rarely used today (see Wilkinson's monograph). A better factorization method is the QR algorithm. Its basic form was developed independently in 1962 by J. G. F. Francis (England) and Vera N. Kublanovskaya (USSR). Today it is the best method for solving unsymmetric eigenvalue problems when all the eigenvalues of the matrix are required. It is particularly effective when the matrix is first brought into a so-called condensed form. For the unsymmetric problem the condensed form is the Hessenberg form, which is discussed later. The basic idea is to perform a QR decomposition, writing the matrix as a product of an orthogonal matrix and an upper triangular matrix, multiply the factors in the reverse order, and iterate.

To understand the QR algorithm we need the following terms:

Definition 2.2.1. A matrix Q ∈ ℝ^(n,n) is called orthogonal if Q^TQ = I.

Remark 2.2.1. Orthogonal matrices are a special case of unitary matrices. A matrix U ∈ ℂ^(n,n) is called unitary if U^HU = I. An orthogonal matrix is thus a unitary matrix whose elements are all real numbers. In practice it is, of course, easier to work with orthogonal than with unitary matrices.

Remark 2.2.2 Orthogonal and unitary matrices are normal matrices.

Let A and B be similar matrices. If the similarity transformation is performed by an orthogonal or unitary matrix, i.e. if B = Q^TAQ or B = U^HAU, we say that the matrices A and B are unitarily similar. Since unitarily similar matrices are a special case of similar matrices, they have the same eigenvalues.

In addition to the unitary similar matrices and their properties for the introduction of the QR algorithm, we will need the following theorem.

Theorem 2.2.1. Let A ∈ ℝ^(n,n) be a regular matrix. Then there exists a decomposition A = QR, where Q is an orthogonal matrix and R is an upper triangular matrix. If the diagonal elements of the matrix R are required to be positive, the decomposition is unique.

The decomposition of the matrix A from Theorem 2.2.1 is called the QR decomposition of the matrix A.

The basic form of the QR algorithm is as follows. Let A₀ := A.

Algorithm 2.2.1. (QR algorithm, basic form)

For i = 0, 1, ⋯ until convergence

Decompose A_i = Q_iR_i (QR decomposition)

$$A_{i+1} = R_i Q_i$$

End
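A minimal sketch of the basic iteration, assuming NumPy: the function name `qr_algorithm_basic`, the stopping test on the strictly lower triangular part, and the symmetric test matrix with eigenvalues of distinct moduli are illustrative assumptions, not part of the algorithm as stated.

```python
import numpy as np

def qr_algorithm_basic(A, max_iter=500, tol=1e-10):
    """Unshifted QR iteration: factor A_i = Q_i R_i, then set A_{i+1} = R_i Q_i."""
    Ak = np.array(A, dtype=float)
    for _ in range(max_iter):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
        # Stop once the part below the main diagonal is negligible
        if np.linalg.norm(np.tril(Ak, -1)) < tol:
            break
    return Ak

# Illustrative symmetric matrix whose eigenvalues have distinct moduli
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

Ak = qr_algorithm_basic(A)
approx = np.sort(np.diag(Ak))
print(approx)
print(np.linalg.eigvalsh(A))  # reference values for comparison
```

The diagonal of the final iterate approximates the eigenvalues; the number of iterations needed depends on the ratios of the eigenvalue moduli.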

Theorem 2.2.1 All matrices A_i generated by Algorithm 2.2.1 are unitarily similar.

Proof

Since $Q_i^T A_i = R_i$, we have $A_{i+1} = R_i Q_i = Q_i^T A_i Q_i$. Applying this relation repeatedly gives $A_{i+1} = Q_i^T Q_{i-1}^T \cdots Q_0^T A_0 Q_0 \cdots Q_{i-1} Q_i$. The matrix Q := Q₀ ⋅ ⋯ ⋅ Q_i is orthogonal, which proves the theorem.

Let us look briefly at some properties of the QR algorithm:

A QR factorization of a full matrix requires O(n³) flops. The whole QR iteration therefore needs O(n⁴) flops, which makes it very slow. The slowness of the basic form of the algorithm can be overcome with two strategies: reducing the matrix to Hessenberg form and introducing shifts.

Let A ∈ ℝ^(n,n). Then Q_i, R_i ∈ ℝ^(n,n) for each i, and the whole algorithm is carried out in real arithmetic. If all eigenvalues of the matrix A have distinct moduli, then they are all real and the algorithm converges. However, if there is at least one i with $|\lambda_{i+1}/\lambda_i| \approx 1$, the QR algorithm converges very slowly. If the matrix A has complex eigenvalues, the basic form of the QR algorithm does not converge. We have already seen the need for the Hessenberg form of a matrix, which is introduced in the following definition.

Definition 2.2.2. A matrix A is said to be in upper Hessenberg form if

$$a_{ij} = 0 \quad \text{for } i - j \geq 2,$$

i.e. A is upper triangular except for one additional diagonal directly below the main diagonal.

The reduction to Hessenberg form can be performed using Householder reflectors or Givens rotations. Let us look at the reduction to Hessenberg form using Householder reflectors.

Let

$$A = \begin{pmatrix} a_{11} & c^T \\ b & B \end{pmatrix} \quad \text{with } b \neq 0.$$

Our goal is to determine ω ∈ ℂ^(n−1) with ‖ω‖₂ = 1 and

$$Q_1 b := (I_{n-1} - 2\omega\omega^H)\, b = k e^1,$$

where I_(n−1) is the identity matrix of order n − 1 and e¹ is the first column of I_(n−1).

We define the Householder reflector as follows:

$$P_1 := \begin{pmatrix} 1 & 0^T \\ 0 & Q_1 \end{pmatrix}.$$

Then

$$A_1 := P_1 A P_1 = \begin{pmatrix} a_{11} & c^T Q_1 \\ \begin{matrix} k \\ 0 \\ \vdots \\ 0 \end{matrix} & Q_1 B Q_1 \end{pmatrix}.$$

The first column of the matrix A₁ now has the shape required by the upper Hessenberg form. In this way the first column has been brought into a suitable form. An analogous procedure is applied to the remaining columns 2, 3, ⋯ in turn.
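The column-by-column reduction described above can be sketched for a real matrix as follows. This is a sketch under assumptions: the function name `hessenberg_householder`, the sign choice for k (taken to avoid cancellation), and the random test matrix are not from the text.

```python
import numpy as np

def hessenberg_householder(A):
    """Reduce a real square matrix to upper Hessenberg form H = Q^T A Q."""
    H = np.array(A, dtype=float)
    n = H.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        b = H[k + 1:, k]                 # part of column k below the diagonal
        norm_b = np.linalg.norm(b)
        if norm_b == 0.0:
            continue                     # column already has the required shape
        # Target value (the scalar k in the text); sign chosen to avoid cancellation
        alpha = -np.sign(b[0]) * norm_b if b[0] != 0 else -norm_b
        w = b.copy()
        w[0] -= alpha
        w /= np.linalg.norm(w)           # unit vector omega
        # Apply P = diag(I, I - 2 w w^T) from the left and from the right
        H[k + 1:, :] -= 2.0 * np.outer(w, w @ H[k + 1:, :])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ w, w)
        # Accumulate the orthogonal transformation
        Q[:, k + 1:] -= 2.0 * np.outer(Q[:, k + 1:] @ w, w)
    return H, Q

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))          # arbitrary test matrix
H, Q = hessenberg_householder(A)
print(np.round(H, 3))
```

Entries below the first subdiagonal of H are zero up to rounding errors, and H has the same eigenvalues as A because Q is orthogonal.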

For the implementation of the QR algorithm it is important that each iteration preserves the structure of the matrix. For a matrix in upper Hessenberg form the following theorem holds.

Theorem 2.2.2 If A is an upper Hessenberg matrix, then the factor Q in its QR factorization A = QR is also an upper Hessenberg matrix.

It follows from the above theorem that RQ is also an upper Hessenberg matrix, because it is the product of an upper triangular matrix and an upper Hessenberg matrix.

Preserving the structure of the matrix is very important for the efficiency of the algorithm. Namely, if A is an upper Hessenberg matrix, its QR factorization costs only O(n²) flops instead of the O(n³) flops necessary for the QR factorization (decomposition) of a full matrix.

Let us look at one more advantage of transforming the matrix to Hessenberg form. When the algorithm converges, A_i → R (i → ∞), where R is an upper triangular matrix. Because every A_i is an upper Hessenberg matrix, this means that the elements of the subdiagonal tend to zero, i.e. $a_{j+1,j}^{(i)} \to 0$ $(i \to \infty)$, $j = 1, 2, \cdots, n-1$, where $a_{j+1,j}^{(i)}$ are elements of the matrix A_i. Hence, for sufficiently large i, the eigenvalues of the initial matrix A can be read off, by Theorem 2.1.3, as the diagonal elements of the matrix A_i.

To improve the algorithm further, shifts are used. The idea is based on the simple fact that if the eigenvalues of A are λ_i, then the eigenvalues of the matrix A − σI are λ_i − σ. If the shift σ is chosen close to an eigenvalue, the algorithm accelerates strongly.

Let A₀ := A.

Algorithm 2.2.2. (QR algorithm with shift)

For i = 0, 1, ⋯ until convergence

Choose a shift σ_i near an eigenvalue of A

Decompose A_i − σ_iI = Q_iR_i (QR decomposition)

$$A_{i+1} = R_i Q_i + \sigma_i I$$

End
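The shifted iteration can be sketched as follows, assuming the shift is taken as the last diagonal element of the current iterate and the iteration is stopped once the last row has decoupled. The function name `qr_shifted_last_eigenvalue`, the stopping rule, and the symmetric test matrix are illustrative assumptions; a practical implementation would also deflate and continue with the remaining eigenvalues.

```python
import numpy as np

def qr_shifted_last_eigenvalue(A, tol=1e-12, max_iter=100):
    """Shifted QR iteration with sigma_i = last diagonal element of A_i.

    Returns the converged last diagonal element and the number of iterations.
    """
    Ak = np.array(A, dtype=float)
    n = Ak.shape[0]
    I = np.eye(n)
    for it in range(1, max_iter + 1):
        sigma = Ak[n - 1, n - 1]
        Q, R = np.linalg.qr(Ak - sigma * I)
        Ak = R @ Q + sigma * I
        # The last row has decoupled once its off-diagonal part is negligible
        if np.linalg.norm(Ak[n - 1, :n - 1]) < tol:
            return Ak[n - 1, n - 1], it
    return Ak[n - 1, n - 1], max_iter

# Illustrative symmetric tridiagonal matrix (an assumption)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

lam, iterations = qr_shifted_last_eigenvalue(A)
print(lam, iterations)
```

For this matrix the last diagonal element settles on an eigenvalue after only a few iterations, far fewer than the unshifted iteration needs.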

It is easy to prove that all matrices in Algorithm 2.2.2 are unitarily similar. From the above it is clear that for real matrices with real eigenvalues it is best to take the shift $\sigma_i = a_{n,n}^{(i)}$, the last diagonal element of A_i.

For a further analysis of the shift parameters we need the concept of an unreduced upper Hessenberg matrix as well as the implicit Q theorem.

Definition 2.2.3 An upper Hessenberg matrix H is called an unreduced upper Hessenberg matrix if its subdiagonal contains no zero element, i.e. h_(k+1,k) ≠ 0 for all k.

Theorem 2.2.3. Let Q^HAQ = H be an unreduced upper Hessenberg matrix with positive subdiagonal elements h_(k+1,k), where Q is a unitary matrix. Then the columns of the matrix Q from the second to the nth, and the matrix H, are uniquely determined by the first column of the matrix Q.

Proof

Let Q = (q¹, q², ⋯, qⁿ). Assume that q¹, q², ⋯, q^k and the first k − 1 columns of the matrix H are already determined. The proof is carried out by mathematical induction on k. For k = 1, q¹ is given and the process can start. Because QH = AQ and H = (h_ij) is an upper Hessenberg matrix, we have

$$h_{k+1,k}\, q^{k+1} + h_{kk}\, q^k + \cdots + h_{1k}\, q^1 = A q^k.$$

Multiplying the last equality by (qⁱ)^H, we get

$$h_{ik} = (q^i)^H A q^k, \quad i = 1, 2, \cdots, k.$$

Hence the kth column of H is determined, except for the element h_(k+1,k).

Because h_(k+1,k) ≠ 0 we have

$$q^{k+1} = \frac{1}{h_{k+1,k}} \left( A q^k - \sum_{i=1}^{k} h_{ik}\, q^i \right).$$

From (q^(k+1))^Hq^(k+1) = 1 and the positivity of h_(k+1,k) we obtain h_(k+1,k) in a unique way.

The theorem is proved.

Remark 2.2.3 The condition h_(k+1,k) > 0 in the previous theorem is needed only to ensure the uniqueness of the matrices Q and H.

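With the help of the implicit Q theorem we now discuss the choice of shift when the real matrix A = A₀ has complex eigenvalues. In that case one has to perform a double shift with σ and σ̄. Namely,

$$\begin{aligned} A_0 - \sigma I &= Q_1 R_1, & A_1 &= R_1 Q_1 + \sigma I, \\ A_1 - \bar{\sigma} I &= Q_2 R_2, & A_2 &= R_2 Q_2 + \bar{\sigma} I. \end{aligned}$$

From this it is easy to obtain $A_2 = Q_2^H Q_1^H A_0 Q_1 Q_2$. The matrices Q₁ and Q₂ can be chosen so that Q₁Q₂ is a real matrix, and therefore the matrix A₂ is real. Indeed,

$$\begin{aligned} Q_1 Q_2 R_2 R_1 &= Q_1 (A_1 - \bar{\sigma} I) R_1 = Q_1 \big( R_1 Q_1 + (\sigma - \bar{\sigma}) I \big) R_1 \\ &= Q_1 R_1 Q_1 R_1 + (\sigma - \bar{\sigma})\, Q_1 R_1 = (A_0 - \sigma I)^2 + (\sigma - \bar{\sigma})(A_0 - \sigma I) \\ &= A_0^2 - (\sigma + \bar{\sigma}) A_0 + \sigma\bar{\sigma}\, I =: M. \end{aligned}$$

Because σ + σ̄ ∈ ℝ and σσ̄ = |σ|² ∈ ℝ, the matrix M is real. Then (Q₁Q₂)(R₂R₁) is a QR factorization of a real matrix, which means that Q₁Q₂ and R₂R₁ can be chosen as real matrices. The first column of the matrix Q₁Q₂ is proportional to the first column of the matrix M, and the other columns are calculated by applying the implicit Q theorem.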

2.3. Mathematical background for Hermitian (symmetric) case

In this section we look at the eigenvalue problem for a symmetric or Hermitian matrix.

Definition 2.3.1. A matrix A ∈ ℝ^(n,n) is called symmetric if A = A^T.

Definition 2.3.2 A matrix A ∈ ℂ^(n,n) is called Hermitian if A = A^H.

Remark 2.3.1 Symmetric matrices are the special case of Hermitian matrices in which all elements are real numbers. Therefore, we formulate the theorems for Hermitian matrices.

Remark 2.3.2 Hermitian and symmetric matrices are normal matrices, which means that they can be diagonalized.

The following theorem states the important fact that the eigenvalues of Hermitian (symmetric) matrices are real. This property greatly simplifies the eigenvalue problem for this class of matrices and makes the class widely applicable in practice.

Theorem 2.3.1. If A is a Hermitian (symmetric) matrix, then:

(a) The eigenvalues of A are all real numbers.
(b) Eigenvectors from different eigenspaces are orthogonal.

Since all the eigenvalues are real, they can be ordered. We therefore assume that the eigenvalues are numbered in ascending order, i.e. λ₁ ≤ λ₂ ≤ ⋯ ≤ λ_n, and that x¹, x², ⋯, xⁿ are the corresponding orthonormal eigenvectors.

If the matrix A is symmetric, the symmetry can be exploited to accelerate significantly the algorithms developed for the unsymmetric case. We illustrate this with the QR algorithm presented in Section 2.2. In the symmetric case it is important to note that the upper Hessenberg form of a symmetric matrix is a tridiagonal matrix, whose QR decomposition requires only O(n) operations. It is also important that the QR algorithm preserves this structure, so that all matrices A_i are tridiagonal. The shift usually taken in this case is the Wilkinson shift, which is defined as the eigenvalue of the matrix

$$\begin{pmatrix} a_{n-1,n-1} & a_{n-1,n} \\ a_{n-1,n} & a_{n,n} \end{pmatrix}$$

that is closest to a_(n,n).

For the QR algorithm with Wilkinson shifts the following theorem holds; its proof is given in [2].

Theorem 2.3.2 (Wilkinson) The QR algorithm with Wilkinson shifts for a symmetric tridiagonal matrix converges globally and at least linearly. For almost all matrices it converges asymptotically cubically.

Now we introduce the very important concept of the Rayleigh quotient, because it gives the best estimate of an eigenvalue for a given vector x ∈ ℂⁿ, x ≠ 0.

Definition 2.3.3 Let A be a Hermitian (symmetric) matrix. For a given vector x ∈ ℂⁿ, x ≠ 0, the Rayleigh quotient is defined as

$$R(x) := \frac{x^H A x}{x^H x}.$$

The importance of the Rayleigh quotient is seen in the following theorem.

Theorem 2.3.3. (Properties of the Rayleigh quotient)

(a) For all x ∈ ℂⁿ, x ≠ 0, it holds that λ₁ ≤ R(x) ≤ λ_n.
(b) $\lambda_1 = \min_{x \neq 0} R(x)$, $\lambda_n = \max_{x \neq 0} R(x)$.
(c) If x ≠ 0 satisfies λ₁ = R(x) or λ_n = R(x), then x is an eigenvector corresponding to λ₁ or λ_n, respectively.
(d) $\lambda_i = \min\{R(x) : x^H x^j = 0,\ j = 1, \cdots, i-1,\ x \neq 0\} = \max\{R(x) : x^H x^j = 0,\ j = i+1, \cdots, n,\ x \neq 0\}$.
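Properties (a) to (c) can be checked numerically. The following is a small sketch, assuming NumPy; the function name `rayleigh_quotient` and the random Hermitian test matrix are illustrative assumptions.

```python
import numpy as np

def rayleigh_quotient(A, x):
    """R(x) = x^H A x / x^H x for Hermitian A (the value is real)."""
    return np.real(np.vdot(x, A @ x) / np.vdot(x, x))

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = (B + B.conj().T) / 2                 # random Hermitian test matrix

eigvals, eigvecs = np.linalg.eigh(A)     # eigenvalues in ascending order

# Rayleigh quotients of random nonzero complex vectors
samples = [
    rayleigh_quotient(A, rng.standard_normal(5) + 1j * rng.standard_normal(5))
    for _ in range(200)
]

print(eigvals[0], min(samples), max(samples), eigvals[-1])
```

Every sampled value lies between the smallest and the largest eigenvalue, and the bounds are attained at the corresponding eigenvectors.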

Paragraph (d) in the previous theorem is known as the Rayleigh principle. However, it is numerically worthless, because in order to determine, for example, λ₂ we need the eigenvector x¹ corresponding to the eigenvalue λ₁. To overcome this disadvantage, the minmax principle of Poincaré is introduced in the following theorem.

Theorem 2.3.4. (minmax principle of Poincaré)

$$\lambda_i = \min_{\dim V = i}\ \max_{x \in V \setminus \{0\}} R(x) = \max_{\dim V = n-i+1}\ \min_{x \in V \setminus \{0\}} R(x).$$

The following formulation is known as the minmax principle of Courant-Fischer and is often more convenient to use.

Theorem 2.3.5. (minmax principle of Courant-Fischer)

$$\lambda_i = \max_{p^1, \cdots, p^{i-1}}\ \min\{R(x) : x^H p^j = 0,\ j = 1, \cdots, i-1,\ x \neq 0\} = \min_{p^1, \cdots, p^{n-i}}\ \max\{R(x) : x^H p^j = 0,\ j = 1, \cdots, n-i,\ x \neq 0\}.$$

From the above it is clear that these theorems are important for the localization of eigenvalues.

The following algorithm is known in linear algebra as Rayleigh quotient iteration.

Let A ∈ ℝ^(n,n) be a symmetric matrix and x⁰ a normalized initial vector, i.e. ‖x⁰‖₂ = 1.

Algorithm 2.3.1. (Rayleigh quotient iteration)

$$\sigma_0 = (x^0)^T A x^0$$

For i = 1, 2, ⋯ until convergence

Solve (A − σ_(i−1)I)yⁱ = x^(i−1)

$$x^i = \frac{y^i}{\|y^i\|_2}$$

$$\sigma_i = (x^i)^T A x^i$$

End
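A minimal sketch of Rayleigh quotient iteration for a real symmetric matrix follows. The function name `rayleigh_quotient_iteration`, the residual-based stopping test, and the test matrix and starting vector are illustrative assumptions.

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, tol=1e-10, max_iter=50):
    """Rayleigh quotient iteration for a real symmetric matrix A."""
    n = A.shape[0]
    x = x0 / np.linalg.norm(x0)           # normalized starting vector
    sigma = x @ A @ x                     # sigma_0
    for _ in range(max_iter):
        # Stop when (sigma, x) is an eigenpair to the requested accuracy
        if np.linalg.norm(A @ x - sigma * x) < tol:
            break
        try:
            y = np.linalg.solve(A - sigma * np.eye(n), x)
        except np.linalg.LinAlgError:
            break                         # shift is an eigenvalue to working precision
        x = y / np.linalg.norm(y)
        sigma = x @ A @ x
    return sigma, x

# Illustrative symmetric matrix and starting vector (assumptions)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
sigma, x = rayleigh_quotient_iteration(A, np.array([1.0, 1.0, 1.0]))
print(sigma, np.linalg.norm(A @ x - sigma * x))
```

Which eigenvalue the iteration converges to depends on the starting vector; near convergence the shifted matrix is almost singular, which is expected and harmless for the direction of yⁱ.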

Theorem 2.3.6. Rayleigh quotient iteration converges cubically.

Finally, we point out that the most effective method for symmetric matrices is the divide-and-conquer method. This method was introduced by Cuppen [3], and the first effective implementation is due to Gu and Eisenstat [4]. More information about this method can be found in [4].

2.4. Physical background

We have already mentioned that eigenvalue problems have numerous applications in engineering, and their wide use in technical disciplines is further motivation for studying them. We illustrate the application of eigenvalues in engineering with the simple example of a mass-spring system. We assume that each spring has the same natural length l and the same spring constant k. Finally, we assume that the displacement of each spring is measured relative to its own local coordinate system with origin at the spring's equilibrium position.

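Applying Newton's second law we get the following system:

$$\begin{aligned} m_1 \frac{d^2 x_1}{dt^2} - k(-2x_1 + x_2) &= 0, \\ m_2 \frac{d^2 x_2}{dt^2} - k(x_1 - 2x_2) &= 0. \end{aligned}$$

We know from vibration theory that

$$x_i = a_i \sin \omega t,$$

where a_i is the amplitude of the vibration of mass i and ω is the frequency of the vibration.

Differentiating the last equation twice with respect to t, we get

$$\frac{d^2 x_i}{dt^2} = -a_i \omega^2 \sin \omega t.$$

If we substitute the expressions obtained in the last two equations for $x_i$ and $\frac{d^2 x_i}{dt^2}$, i = 1, 2, into the initial system and write the system in matrix form, we get

$$A \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \omega^2 \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \tag{6}$$

where

$$A = \begin{pmatrix} \dfrac{2k}{m_1} & -\dfrac{k}{m_1} \\ -\dfrac{k}{m_2} & \dfrac{2k}{m_2} \end{pmatrix}.$$

Equation (6) represents an unsymmetric eigenvalue problem.

More information on this case can be found in [5].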
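For concrete numbers, the eigenvalue problem (6) can be solved directly. In the sketch below the values of m₁, m₂ and k are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical masses and spring constant (assumptions for illustration)
m1, m2, k = 1.0, 2.0, 3.0

A = np.array([[2 * k / m1, -k / m1],
              [-k / m2, 2 * k / m2]])

# Eigenvalues are the squared frequencies, eigenvectors the amplitude vectors
omega_sq, modes = np.linalg.eig(A)
omega = np.sqrt(omega_sq)

print(omega)   # natural frequencies
print(modes)   # columns: amplitude vectors (a1, a2)
```

For these values A is not symmetric because m₁ ≠ m₂, yet both eigenvalues are real and positive, so the frequencies ω are real.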

2.5. General Linear Eigenvalue Problem

In this section we deal with the general linear eigenvalue problem, that is, the problem

$$Ax = \lambda Bx, \quad x \neq 0, \tag{7}$$

where A, B ∈ ℂ^(n,n).

The scalar λ is called an eigenvalue of the problem (7), and x is said to be an eigenvector of (7) corresponding to λ.

A common abbreviation for the general linear eigenvalue problem is GEP. The eigenvalue problem discussed previously is called the standard eigenvalue problem and is abbreviated SEP. In practice we meet the GEP more often than the SEP. Let us now consider some features of the GEP and establish its relationship with the SEP.

The eigenvalues of (7) are the zeros of the characteristic polynomial, which is defined as p_n(λ) := det(A − λB). For the GEP the degree of the polynomial p_n is less than or equal to n. The characteristic polynomial p_n has degree n if and only if B is a regular matrix. For B = I we obtain the SEP, which has n eigenvalues. A GEP can have fewer than n eigenvalues. It can also happen that p_n(λ) ≡ 0, in which case the GEP has infinitely many eigenvalues.

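The following two examples illustrate these situations for the GEP.

Example 4.5.1 Let $A = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 6 \\ 0 & 0 \end{pmatrix}$ in the GEP. Then p_n(λ) ≡ 0, every λ ∈ ℂ is an eigenvalue, and $x = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is a corresponding eigenvector.

Example 4.5.2 Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$, $B = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$ in the GEP. Then the characteristic polynomial is p_n(λ) = (1 − λ)(4 − λ) − (3 + λ)(2 + λ) = −2 − 10λ, and the only eigenvalue is λ = −1/5.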
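Both examples can be verified by evaluating p_n(λ) = det(A − λB) numerically. The helper name `char_poly_value` and the sample values of λ are assumptions made for this sketch.

```python
import numpy as np

def char_poly_value(A, B, lam):
    """Evaluate p_n(lambda) = det(A - lambda*B)."""
    return np.linalg.det(A - lam * B)

# First example: p_n vanishes identically
A1 = np.zeros((2, 2))
B1 = np.array([[0.0, 6.0],
               [0.0, 0.0]])
vals1 = [char_poly_value(A1, B1, lam) for lam in (-2.0, 0.0, 1.5)]

# Second example: p_n(lambda) = -2 - 10*lambda, with the single zero -1/5
A2 = np.array([[1.0, 2.0],
               [3.0, 4.0]])
B2 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])
val2 = char_poly_value(A2, B2, -0.2)

print(vals1, val2)
```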

The atypical behavior in the last two examples results from the fact that the matrix B was not regular. Therefore, it is usually assumed in the GEP that A and B are Hermitian matrices and that B is positive definite, that is, all the eigenvalues of the matrix B are positive.

Our goal is to find a connection between this GEP and the symmetric SEP. Since B is positive definite, it has a Cholesky decomposition B = CC^H. Then

$$Ax = \lambda Bx \iff Fy = \lambda y, \quad F := C^{-1} A C^{-H}, \quad y := C^H x. \tag{8}$$

Since the matrix F is Hermitian, it has n real eigenvalues, and therefore the GEP (7) also has n real eigenvalues. Namely, by (8) the two problems have the same eigenvalues.

Let yⁱ and y^j be orthonormal eigenvectors of F. Then for xⁱ = C^(−H)yⁱ and x^j = C^(−H)y^j we have

$$\delta_{ij} = (y^i)^H y^j = (C^H x^i)^H (C^H x^j) = (x^i)^H C C^H x^j = (x^i)^H B x^j. \tag{9}$$

From equation (9) it is clear that the eigenvectors of the GEP (7) are orthonormal with respect to the new inner product defined as ⟨x, y⟩_B := x^HBy. We now redefine the Rayleigh quotient for the GEP (7) as follows:

$$R_{(A,B)}(x) := \frac{x^H A x}{x^H B x}.$$

With this inner product and this Rayleigh quotient, all theorems of Section 2.3 remain valid after the appropriate modifications of the definitions. In this sense the GEP with Hermitian A and positive definite B generalizes the Hermitian (symmetric) SEP.
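The reduction of the GEP to a symmetric SEP via the Cholesky decomposition can be sketched as follows for real symmetric matrices. The random test matrices are assumptions for illustration, and explicit inverses are used only for brevity; triangular solves would be preferable in practice.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                      # symmetric test matrix
N = rng.standard_normal((n, n))
B = N @ N.T + n * np.eye(n)            # symmetric positive definite test matrix

C = np.linalg.cholesky(B)              # B = C C^T with C lower triangular
Cinv = np.linalg.inv(C)
F = Cinv @ A @ Cinv.T                  # F = C^{-1} A C^{-H}
F = (F + F.T) / 2                      # remove rounding asymmetry

lam, Y = np.linalg.eigh(F)             # symmetric SEP: F y = lambda y
X = Cinv.T @ Y                         # back-transform: x = C^{-H} y

print(lam)
print(np.round(X.T @ B @ X, 12))       # should be the identity matrix
```

The columns of X satisfy Ax = λBx and are orthonormal with respect to the inner product ⟨x, y⟩_B.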

3. The quadratic eigenvalue problems

In practice, nonlinear eigenproblems commonly arise in dynamic/stability analysis of structures and in fluid mechanics, electronic behavior of semiconductor hetero-structures, vibration of fluid–solid structures, vibration of sandwich plates, accelerator design, vibro-acoustics of piezoelectric/poroelastic structures, nonlinear integrated optics, regularization of total least squares problems and stability of delay differential equations. In practice, the most important is the quadratic problem Q(λ)x = 0, where

$$Q(\lambda) := \lambda^2 A + \lambda B + C, \quad A, B, C \in \mathbb{C}^{n \times n}, \quad A \neq 0, \quad x \neq 0. \tag{10}$$

This section is organized as follows:

Basic Properties: We consider the Rayleigh functional and the minmax characterization.
Linearization: A standard approach for investigating or numerically solving quadratic eigenvalue problems is linearization, where the original problem is transformed into a generalized linear eigenvalue problem with the same spectrum.
Physical Background: We study vibration analysis of structural systems.

3.1. Basic Properties

Variational characterization is important for finding eigenvalues. In this section we give a brief review of the variational characterization of nonlinear eigenvalue problems. Since quadratic eigenproblems are a special case of nonlinear eigenvalue problems, the results for nonlinear eigenvalue problems apply in particular to quadratic eigenvalue problems. Variational characterization is a generalization of the well-known minmax characterization for linear eigenvalue problems.

We consider nonlinear eigenvalue problems

$$T(\lambda)x = 0, \tag{11}$$

where T(λ) ∈ ℂ^(n×n), λ ∈ J, is a family of Hermitian matrices depending continuously on the parameter λ ∈ J, and J is a real open interval which may be unbounded.

Problems of this type arise in damped vibrations of structures, conservative gyroscopic systems, lateral buckling problems, problems with retarded arguments, fluid–solid vibrations, and quantum dot heterostructures.

To generalize the variational characterization of eigenvalues we need a generalization of the Rayleigh quotient. To this end we assume that

(A) for every fixed x ∈ ℂⁿ, x ≠ 0 the real scalar equation

$$f(\lambda; x) := x^H T(\lambda) x = 0 \tag{12}$$

has at most one solution p(x) ∈ J. Then f(λ; x) = 0 implicitly defines a functional p on some subset D ⊂ ℂⁿ, which is called the Rayleigh functional of (11).

(B) for every x ∈ D and every λ ∈ J with λ ≠ p(x) it holds that (λ − p(x))f(λ; x) > 0.

If p is defined on D = ℂⁿ\{0}, then the problem (11) is called overdamped; otherwise it is called nonoverdamped.

Generalizations of the minmax and the maxmin characterizations of the eigenvalues were proved by Duffin [6] for the quadratic case and by Rogers [7] for general overdamped problems. For nonoverdamped eigenproblems the natural ordering, which calls the smallest eigenvalue the first one, the second smallest the second one, etc., is not appropriate. The next theorem, which is proved in [8], gives the minmax characterization of the eigenvalues in this case.

Theorem 3.1.1.

Let J be an open interval in ℝ, and let T(λ) ∈ ℂ^(n×n), λ ∈ J, be a family of Hermitian matrices depending continuously on the parameter λ ∈ J, such that the conditions (A) and (B) are satisfied. Let H_l denote the set of all l-dimensional subspaces of ℂⁿ. Then the following statements hold.

(a) For every l ∈ ℕ there is at most one lth eigenvalue of T(⋅), which can be characterized by

$$\lambda_l = \min_{V \in H_l,\ V \cap D \neq \emptyset}\ \sup_{v \in V \cap D} p(v). \tag{13}$$

(b) If

$$\lambda_l := \inf_{V \in H_l,\ V \cap D \neq \emptyset}\ \sup_{v \in V \cap D} p(v) \in J$$

for some l ∈ ℕ, then λ_l is the lth eigenvalue of T(⋅) in J, and (13) holds.

(c) If there exist the kth and the lth eigenvalue λ_k and λ_l in J (k < l), then J contains the jth eigenvalue λ_j (k ≤ j ≤ l) as well, and λ_k ≤ λ_j ≤ λ_l.

(d) Let λ₁ = inf_(x∈D) p(x) ∈ J and λ_l ∈ J. If the minimum in (13) is attained for an l-dimensional subspace V, then V ⊂ D ∪ {0}, and (13) can be replaced with

$$\lambda_l = \min_{V \in H_l,\ V \subset D \cup \{0\}}\ \sup_{v \in V,\ v \neq 0} p(v).$$

(e) λ̃ is an lth eigenvalue if and only if μ = 0 is the lth largest eigenvalue of the linear eigenproblem T(λ̃)x = μx.

(f) The minimum in (13) is attained for the invariant subspace of T(λ_l) corresponding to its l largest eigenvalues.

Sylvester's law of inertia plays an important role in nonlinear eigenvalue problems, so we briefly recall it. For this purpose we define the inertia of a Hermitian matrix T as follows [9].

Definition 3.1.1. The inertia of a Hermitian matrix T is the triplet of nonnegative integers In(T) = (n_p, n_n, n_z) where n_p, n_n and n_z are the number of positive, negative, and zero eigenvalues of T (counting multiplicities).

Next, we consider the case that an extreme eigenvalue λ₁ := inf_{x ∈ D} p(x) or λ_n := sup_{x ∈ D} p(x) is contained in J.

Theorem 3.1.2. Assume that T : J → ℂ^(n×n) satisfies the conditions of the minmax characterization, and let (n_p, n_n, n_z) be the inertia of T(σ) for some σ ∈ J.

(a) If λ₁ := inf_{x ∈ D} p(x) ∈ J, then the nonlinear eigenproblem T(λ)x = 0 has exactly n_p eigenvalues λ₁ ≤ ⋯ ≤ λ_{n_p} in J which are less than σ.
(b) If sup_{x ∈ D} p(x) ∈ J, then the nonlinear eigenproblem T(λ)x = 0 has exactly n_n eigenvalues λ_{n−n_n+1} ≤ ⋯ ≤ λ_n in J exceeding σ.
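The inertia and the counting statement of Theorem 3.1.2 can be tried out numerically in the simplest special case. The following is only a sketch under stated assumptions: it uses NumPy, the helper name `inertia` and the tolerance `tol` are illustrative choices, and the family T(λ) = λI − A with a small Hermitian matrix A is taken as a linear instance in which p is the ordinary Rayleigh quotient and J = ℝ.

```python
import numpy as np

def inertia(T, tol=1e-10):
    """Return (n_p, n_n, n_z) for a Hermitian matrix T.

    tol is an illustrative threshold for treating an eigenvalue as zero.
    """
    w = np.linalg.eigvalsh(T)
    n_p = int(np.sum(w > tol))
    n_n = int(np.sum(w < -tol))
    return n_p, n_n, len(w) - n_p - n_n

# Linear special case T(lam) = lam*I - A with a Hermitian test matrix A
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
sigma = 3.5

n_p, n_n, n_z = inertia(sigma * np.eye(3) - A)

# Direct count of the eigenvalues of T(.) (= eigenvalues of A) around sigma
eigs = np.linalg.eigvalsh(A)
below = int(np.sum(eigs < sigma))
above = int(np.sum(eigs > sigma))

print("inertia of T(sigma):", (n_p, n_n, n_z))
print("eigenvalues below / above sigma:", below, above)
```

In this instance n_p agrees with the number of eigenvalues below σ and n_n with the number above σ, which is the classical slicing argument that the theorem extends to nonlinear families.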

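We now consider the quadratic eigenvalue problem (QEP). To this end we adapt the real scalar equation (12) to the QEP and obtain

$$f(\lambda; x) := \lambda^2 x^H A x + \lambda x^H B x + x^H C x = 0 \quad \text{for every fixed } x \in \mathbb{C}^n,\ x \neq 0. \tag{14}$$

Natural candidates for the Rayleigh functionals of the QEP (Eq. (10)) are

$$p_+(x) := -\frac{x^H B x}{2 x^H A x} + \sqrt{\left(\frac{x^H B x}{2 x^H A x}\right)^2 - \frac{x^H C x}{x^H A x}} \tag{15}$$

and

$$p_-(x) := -\frac{x^H B x}{2 x^H A x} - \sqrt{\left(\frac{x^H B x}{2 x^H A x}\right)^2 - \frac{x^H C x}{x^H A x}}. \tag{16}$$

The Rayleigh functionals generalize the Rayleigh quotient.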
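As a small illustration of (14) to (16), the two candidate functionals can be evaluated directly from the three quadratic forms. This is a minimal sketch assuming NumPy; the function names `rayleigh_functionals` and `f`, and the test matrices, are hypothetical and not part of the text.

```python
import numpy as np

def rayleigh_functionals(A, B, C, x):
    """Return (p_minus, p_plus) from (15), (16), or None if x is outside D."""
    a = np.real(np.vdot(x, A @ x))   # x^H A x
    b = np.real(np.vdot(x, B @ x))   # x^H B x
    c = np.real(np.vdot(x, C @ x))   # x^H C x
    disc = (b / (2 * a)) ** 2 - c / a
    if disc < 0:
        return None                  # no real root: p is not defined at x
    root = np.sqrt(disc)
    return -b / (2 * a) - root, -b / (2 * a) + root

def f(lam, A, B, C, x):
    """Scalar function (14): x^H (lam^2 A + lam B + C) x."""
    return np.real(np.vdot(x, (lam ** 2 * A + lam * B + C) @ x))

# Illustrative Hermitian data with A > 0
A = np.eye(2)
B = np.array([[5.0, 1.0], [1.0, 6.0]])
C = np.array([[1.0, 0.2], [0.2, 2.0]])
x = np.array([1.0, 2.0j])

p_minus, p_plus = rayleigh_functionals(A, B, C, x)
print(p_minus, p_plus)
print(f(p_minus, A, B, C, x), f(p_plus, A, B, C, x))  # both close to zero
```

Returning `None` when the discriminant is negative mirrors the fact that the functionals are only defined on a subset D in the nonoverdamped case.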

In this section we deal with the hyperbolic quadratic pencil as an example of an overdamped problem, and with the gyroscopically stabilized pencil as an example of a problem that is not overdamped.

Let us first look briefly at the hyperbolic quadratic pencil. It is an overdamped quadratic pencil given by (10) in which A = A^H > 0, B = B^H, C = C^H. A hyperbolic quadratic pencil has the following interesting features:

The ranges J₊ := p₊(ℂⁿ\{0}) and J₋ := p₋(ℂⁿ\{0}) are disjoint real intervals with max J₋ < min J₊. Q(λ) is positive definite for λ < min J₋ and λ > max J₊, and it is negative definite for λ ∈ (max J₋, min J₊).

(Q, J₊) and (−Q, J₋) satisfy the conditions of the variational characterization of the eigenvalues, i.e. there exist 2n eigenvalues [1]

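$$\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n < \lambda_{n+1} \leq \cdots \leq \lambda_{2n} \tag{17}$$

and

$$\lambda_j = \min_{\dim V = j}\ \max_{x \in V,\ x \neq 0} p_-(x), \qquad \lambda_{n+j} = \min_{\dim V = j}\ \max_{x \in V,\ x \neq 0} p_+(x), \qquad j = 1, 2, \cdots, n. \tag{18}$$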
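The splitting (17) can be observed on a small example. The sketch below assumes NumPy and uses an illustrative overdamped pencil with A = I; the matrices and the helper `p_pm` are hypothetical choices. It computes the 2n eigenvalues from a companion matrix and checks that the n smallest eigenvalues are values of p₋ and the n largest are values of p₊ at the corresponding eigenvectors. It does not carry out the minmax over subspaces in (18).

```python
import numpy as np

n = 4
A = np.eye(n)
B = np.diag([8.0, 10.0, 9.0, 11.0])
C = np.array([[2.0, 0.5, 0.3, 0.1],
              [0.5, 1.5, 0.2, 0.4],
              [0.3, 0.2, 1.0, 0.3],
              [0.1, 0.4, 0.3, 2.5]])

# Companion matrix for lam^2 x + lam B x + C x = 0 (valid because A = I):
# with v = [x; lam*x] we have L v = lam v.
L = np.block([[np.zeros((n, n)), np.eye(n)],
              [-C, -B]])
lam_all, V = np.linalg.eig(L)
max_imag = float(np.max(np.abs(lam_all.imag)))

order = np.argsort(lam_all.real)
lam = lam_all.real[order]
X = V[:n, order]          # top block of each eigenvector is x

def p_pm(x):
    """Rayleigh functionals (p_minus, p_plus) from (15), (16)."""
    a = np.real(np.vdot(x, A @ x))
    b = np.real(np.vdot(x, B @ x))
    c = np.real(np.vdot(x, C @ x))
    root = np.sqrt((b / (2 * a)) ** 2 - c / a)
    return -b / (2 * a) - root, -b / (2 * a) + root

err_minus = max(abs(p_pm(X[:, j])[0] - lam[j]) for j in range(n))
err_plus = max(abs(p_pm(X[:, j])[1] - lam[j]) for j in range(n, 2 * n))

print("eigenvalues:", lam)
print("gap lam_n < lam_{n+1}:", lam[n - 1], lam[n])
print("errors:", err_minus, err_plus)
```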

Now we look at gyroscopically stabilized systems (GSS). A quadratic matrix polynomial

$$Q(\lambda) = \lambda^2 I + \lambda B + C, \quad B = B^H,\ \det B \neq 0,\ C = C^H \tag{19}$$

is gyroscopically stabilized if for some k > 0 it holds that

$$|B| > kI + k^{-1}C, \tag{20}$$

where |B| denotes the positive square root of B².

Definition 3.1.2. An eigenvalue λ is of positive type if x^H Q′(λ)x > 0 for every corresponding eigenvector x ∈ ℂⁿ, x ≠ 0.

An eigenvalue λ is of negative type if x^H Q′(λ)x < 0 for every corresponding eigenvector x ∈ ℂⁿ, x ≠ 0.

Theorem (Barkwell, Lancaster, Markus 1992)

(a) The spectrum of a gyroscopically stabilized pencil is real, i.e. Q is quasi-hyperbolic.
(b) All eigenvalues are either of positive type or of negative type.
(c) If (n_p, n_n, n_z) is the inertia of B, then Q(λ)x = 0 has 2n_p negative and 2n_n positive eigenvalues.
(d) The 2n_p negative eigenvalues lie in two disjoint intervals, n_p eigenvalues in each; the ones in the left interval are of negative type, the ones in the right interval are of positive type.
(e) The 2n_n positive eigenvalues lie in two disjoint intervals, n_n eigenvalues in each; the ones in the left interval are of negative type, the ones in the right interval are of positive type.
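The statements of this theorem can be checked on a small gyroscopically stabilized example. The following sketch assumes NumPy and a diagonal B, so that |B| is simply the matrix of absolute values; the test matrices and the choice k = 1 are illustrative assumptions. The type of an eigenvalue is read off from the sign of x^H Q′(λ)x with Q′(λ) = 2λI + B.

```python
import numpy as np

n = 4
B = np.diag([8.0, 10.0, -9.0, -11.0])      # Hermitian, nonsingular, indefinite
C = np.array([[2.0, 0.5, 0.3, 0.1],
              [0.5, 1.5, 0.2, 0.4],
              [0.3, 0.2, 1.0, 0.3],
              [0.1, 0.4, 0.3, 2.5]])

# Condition (20) with the illustrative choice k = 1
k = 1.0
absB = np.diag(np.abs(np.diag(B)))          # |B| for a diagonal B
stabilized = bool(np.all(np.linalg.eigvalsh(absB - (k * np.eye(n) + C / k)) > 0))

# Eigenvalues of Q(lam) = lam^2 I + lam B + C from a companion matrix
L = np.block([[np.zeros((n, n)), np.eye(n)],
              [-C, -B]])
lam_all, V = np.linalg.eig(L)
order = np.argsort(lam_all.real)
lam = lam_all.real[order]
X = V[:n, order]

# Sign of x^H Q'(lam) x for each eigenpair, in ascending order of lam
types = [int(np.sign(np.real(np.vdot(x, (2 * l * np.eye(n) + B) @ x))))
         for l, x in zip(lam, X.T)]

n_p = int(np.sum(np.diag(B) > 0))
n_n = int(np.sum(np.diag(B) < 0))
num_negative = int(np.sum(lam < 0))
num_positive = int(np.sum(lam > 0))

print("stabilized:", stabilized)
print("eigenvalues:", lam)
print("types:", types)
```

For this example the sorted eigenvalues show the pattern described in (d) and (e): negative type in the left interval and positive type in the right interval, both for the negative and for the positive eigenvalues.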

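Without loss of generality we consider only the positive eigenvalues.

Let now

$$p_\pm(x) := -\frac{x^H B x}{2 x^H x} \pm \sqrt{\left(\frac{x^H B x}{2 x^H x}\right)^2 - \frac{x^H C x}{x^H x}}$$

be the functionals appropriate for GSS. With them, we can define the Rayleigh functionals

$$p_-^+(x) := \begin{cases} p_-(x) & \text{for } p_-(x) > 0,\\ 0 & \text{else,} \end{cases} \qquad p_+^+(x) := \begin{cases} p_+(x) & \text{for } p_-(x) > 0,\\ 0 & \text{else.} \end{cases}$$

For these functionals, Voss and Kostić determined the intervals in which the eigenvalues admit a minmax characterization.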

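In order to obtain a minmax characterization of all eigenvalues, we introduce a new Rayleigh functional. This is a new strategy. To this end we write the matrices B and C and the vector x in the following block form:

$$B = \begin{pmatrix} B_1 & 0\\ 0 & -B_2 \end{pmatrix},\ B_i > 0,\ i = 1, 2, \qquad C = \begin{pmatrix} C_{11} & C_{12}\\ C_{12}^H & C_{22} \end{pmatrix},\ C_{ii} > 0,\ i = 1, 2, \qquad x = \begin{pmatrix} z\\ y \end{pmatrix} \neq 0.$$

We define

$$\begin{aligned}
Q_1(\lambda) &:= \lambda^2 I + \lambda B_1 + C_{11},\\
Q_2(\lambda) &:= \lambda^2 I - \lambda B_2 + C_{22},\\
T(\lambda) &:= Q_2(\lambda) - C_{12}^H Q_1(\lambda)^{-1} C_{12}.
\end{aligned}$$

Because of condition (20), Q₁(λ) and Q₂(λ) are hyperbolic.

We consider the following eigenvalue problem:

$$T(\lambda) y = 0, \quad y \in \mathbb{C}^{n - n_p},\ y \neq 0.$$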

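The following theorem holds.

Theorem 3.1.3. λ is an eigenvalue of Q(⋅) if and only if λ is an eigenvalue of T(⋅).

Proof

For the positive eigenvalues considered here, Q₁(λ) > 0 and hence Q₁(λ) is invertible. With x partitioned into the blocks z and y we have

$$\begin{aligned}
Q(\lambda)x = 0 &\Leftrightarrow \begin{pmatrix} Q_1(\lambda) & C_{12}\\ C_{12}^H & Q_2(\lambda) \end{pmatrix} \begin{pmatrix} z\\ y \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}\\
&\Leftrightarrow Q_1(\lambda) z + C_{12} y = 0 \ \wedge\ C_{12}^H z + Q_2(\lambda) y = 0\\
&\Leftrightarrow z = -Q_1(\lambda)^{-1} C_{12} y \ \wedge\ \left(Q_2(\lambda) - C_{12}^H Q_1(\lambda)^{-1} C_{12}\right) y = 0\\
&\Leftrightarrow z = -Q_1(\lambda)^{-1} C_{12} y \ \wedge\ T(\lambda) y = 0.
\end{aligned}$$

The theorem is proved.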
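The equivalence in Theorem 3.1.3 can be checked numerically: at every positive eigenvalue of Q(⋅) the reduced matrix T(λ) should be singular. The sketch below assumes NumPy and real test blocks (so that C₁₂^H is the transpose); the data and the function name `T` used as code are illustrative, and the eigenvalues of Q(⋅) are obtained from a companion matrix.

```python
import numpy as np

B1 = np.diag([8.0, 10.0])
B2 = np.diag([9.0, 11.0])
C11 = np.array([[2.0, 0.5], [0.5, 1.5]])
C12 = np.array([[0.3, 0.1], [0.2, 0.4]])
C22 = np.array([[1.0, 0.3], [0.3, 2.5]])
I2, Z2 = np.eye(2), np.zeros((2, 2))

B = np.block([[B1, Z2], [Z2, -B2]])
C = np.block([[C11, C12], [C12.T, C22]])

def T(lam):
    """Reduced family T(lam) = Q2(lam) - C12^H Q1(lam)^{-1} C12 (real data)."""
    Q1 = lam ** 2 * I2 + lam * B1 + C11
    Q2 = lam ** 2 * I2 - lam * B2 + C22
    return Q2 - C12.T @ np.linalg.solve(Q1, C12)

# Eigenvalues of Q(lam) = lam^2 I + lam B + C from a companion matrix
L = np.block([[np.zeros((4, 4)), np.eye(4)],
              [-C, -B]])
lam_all = np.linalg.eigvals(L)
pos = np.sort(lam_all.real[lam_all.real > 0])

# Relative smallest singular value of T at each positive eigenvalue of Q
residuals = [np.linalg.svd(T(l), compute_uv=False)[-1] / np.linalg.norm(T(l), 2)
             for l in pos]

print("positive eigenvalues of Q:", pos)
print("relative smallest singular values of T:", residuals)
```

Working with T(λ) halves the dimension of the problem in this example, at the price of a rational dependence on λ.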

Analogously to (14) we define the functions

$$q(\lambda; y) := y^H T(\lambda) y \quad \text{and} \quad f_2(\lambda; y) := y^H Q_2(\lambda) y.$$

The following theorem collects the properties of q(λ; y).

Theorem 3.1.4. The function q(λ; y) has the following properties:

(a) For each vector y ≠ 0, q(0; y) > 0.
(b) For λ > 0 the function q(λ; y) has exactly two zeros for each vector y ≠ 0. The smallest zero of q(λ; y) is smaller than the smallest zero of f₂(λ; y), and the largest zero of q(λ; y) is greater than the largest zero of f₂(λ; y).
(c) From f₂′(λ; y) > 0 and λ > 0 it follows that q′(λ; y) > 0.

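Proof

(a) Because C > 0 we have

$$\begin{pmatrix} -y^H C_{12}^H C_{11}^{-1} & y^H \end{pmatrix} \begin{pmatrix} C_{11} & C_{12}\\ C_{12}^H & C_{22} \end{pmatrix} \begin{pmatrix} -C_{11}^{-1} C_{12} y\\ y \end{pmatrix} > 0.$$

It follows that

$$\begin{pmatrix} -y^H C_{12}^H C_{11}^{-1} & y^H \end{pmatrix} \begin{pmatrix} 0\\ C_{22} y - C_{12}^H C_{11}^{-1} C_{12} y \end{pmatrix} > 0,$$

and then

$$q(0; y) = y^H \left(C_{22} - C_{12}^H C_{11}^{-1} C_{12}\right) y > 0.$$

(b) We have already mentioned that Q₂(λ) is hyperbolic. This means that the function f₂(λ; y) has two distinct real roots for each vector y. For λ > 0 we have Q₁(λ) > 0. Therefore

$$q(\lambda; y) = y^H Q_2(\lambda) y - \underbrace{y^H C_{12}^H Q_1(\lambda)^{-1} C_{12} y}_{> 0} < y^H Q_2(\lambda) y = f_2(\lambda; y). \tag{21}$$

Furthermore,

$$\lim_{\lambda \to +\infty} q(\lambda; y) = +\infty. \tag{22}$$

From (a), (21) and (22) it follows that the function q(λ; y) has two zeros for λ > 0. The smallest zero a of q(λ; y) is smaller than the smallest zero of f₂(λ; y), and the largest zero b of q(λ; y) is greater than the largest zero of f₂(λ; y).

(c) Differentiating q with respect to λ, and using Q₁′(λ) = 2λI + B₁ > 0 for λ > 0, gives

$$q'(\lambda; y) = \underbrace{2\lambda\, y^H y - y^H B_2 y}_{= f_2'(\lambda; y)} + \underbrace{y^H C_{12}^H Q_1(\lambda)^{-1} \left(2\lambda I + B_1\right) Q_1(\lambda)^{-1} C_{12} y}_{> 0}.$$

The theorem is proved.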
We now define new functionals.

Definition 3.1.3. Let q(a; y) = q(b; y) = 0 and 0 < a < b. We define two new functionals

$$t_-(y) = a, \qquad t_+(y) = b, \qquad W_\pm := t_\pm(h).$$

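Theorem 3.1.5. It holds that max W₋ < min W₊.

Proof

Let

$$p_{2\pm}(x) := \frac{x^H B_2 x}{2 x^H x} \pm \sqrt{\left(\frac{x^H B_2 x}{2 x^H x}\right)^2 - \frac{x^H C_{22} x}{x^H x}}, \qquad J_\pm := p_{2\pm}\left(\mathbb{C}^{n - n_p} \setminus \{0\}\right)$$

be the Rayleigh functionals of Q₂(λ) and their ranges. Because Q₂(λ) is hyperbolic, max J₋ < min J₊. By Theorem 3.1.4,

$$t_-(y) < \max J_- < \min J_+ < t_+(y) \quad \text{for every } y \in \mathbb{C}^{n - n_p} \setminus \{0\},$$

and therefore

$$\max W_- < \max J_- < \min J_+ < \min W_+.$$

The theorem is proved.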
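The functionals t₋ and t₊ have no closed form, but for a given y they can be approximated by root finding on q(λ; y). The following is a minimal sketch assuming NumPy and real test blocks; the bisection helper, the sample vectors and the bracketing strategy (a zero in (0, p₂₋(y)) and a zero above p₂₊(y), as suggested by Theorem 3.1.4) are illustrative choices, not the author's algorithm.

```python
import numpy as np

B1 = np.diag([8.0, 10.0])
B2 = np.diag([9.0, 11.0])
C11 = np.array([[2.0, 0.5], [0.5, 1.5]])
C12 = np.array([[0.3, 0.1], [0.2, 0.4]])
C22 = np.array([[1.0, 0.3], [0.3, 2.5]])
I2 = np.eye(2)

def q(lam, y):
    """q(lam; y) = y^H T(lam) y with T = Q2 - C12^H Q1^{-1} C12 (real data)."""
    Q1 = lam ** 2 * I2 + lam * B1 + C11
    Q2 = lam ** 2 * I2 - lam * B2 + C22
    return float(np.real(np.vdot(y, Q2 @ y - C12.T @ np.linalg.solve(Q1, C12 @ y))))

def p2(y):
    """Zeros of f2(lam; y) = lam^2 y^H y - lam y^H B2 y + y^H C22 y."""
    yy = np.real(np.vdot(y, y))
    b = np.real(np.vdot(y, B2 @ y))
    c = np.real(np.vdot(y, C22 @ y))
    r = np.sqrt((b / (2 * yy)) ** 2 - c / yy)
    return b / (2 * yy) - r, b / (2 * yy) + r

def bisect(g, lo, hi, iters=100):
    """Plain bisection; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def t_pm(y):
    """Approximate (t_minus(y), t_plus(y)), the two positive zeros of q(.; y)."""
    pm, pp = p2(y)
    a = bisect(lambda l: q(l, y), 0.0, pm)      # q(0) > 0 > q(pm)
    hi = 2.0 * pp
    while q(hi, y) <= 0:                        # search an upper bracket
        hi *= 2.0
    b = bisect(lambda l: q(l, y), pp, hi)       # q(pp) < 0 < q(hi)
    return a, b

rng = np.random.default_rng(0)
samples = rng.standard_normal((5, 2))
t_values = [t_pm(y) for y in samples]
p_values = [p2(y) for y in samples]

print("t_minus, t_plus:", t_values)
print("p2_minus, p2_plus:", p_values)
```

On these samples every t₋(y) lies below p₂₋(y), every t₊(y) lies above p₂₊(y), and the sampled values of t₋ stay below those of t₊, in line with Theorems 3.1.4 and 3.1.5.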

Theorem 3.1.6. All positive eigenvalues of (19) are either minmax values of t₋(y) or maxmin values of t₊(y).

3.2. Linearization

In this section we deal with linearization. As mentioned, linearization is the standard procedure for reducing a QEP to a GEP in order to facilitate the computation of eigenvalues. We have already seen that eigenvalue problems usually arise from solving differential equations or systems of differential equations. The basic idea of linearization therefore comes from the field of differential equations, where a differential equation of second order can be reduced to a system of two differential equations of first order in two unknown functions.

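The basic idea of linearization of the QEP

$$(\lambda^2 A + \lambda B + C)x = 0$$

is to introduce the substitution z = λx. Then we get

λAz + Bz + Cx = 0 or
λAz + λBx + Cx = 0.

The resulting equations are GEPs because, together with z = λx, they can be written respectively in the form

$$\begin{pmatrix} -B & -C\\ I & O \end{pmatrix} \begin{pmatrix} z\\ x \end{pmatrix} = \lambda \begin{pmatrix} A & O\\ O & I \end{pmatrix} \begin{pmatrix} z\\ x \end{pmatrix},$$

$$\begin{pmatrix} O & -C\\ I & O \end{pmatrix} \begin{pmatrix} z\\ x \end{pmatrix} = \lambda \begin{pmatrix} A & B\\ O & I \end{pmatrix} \begin{pmatrix} z\\ x \end{pmatrix}.$$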

Since all matrices of the corresponding GEPs are of size 2n × 2n, each GEP has 2n eigenvalues, and therefore the QEP also has 2n eigenvalues. From the above it is clear that the linearization is not unique. When choosing a linearization of a QEP, it is important to preserve the symmetry and other spectral properties of the QEP whenever possible. The application of linearization is shown in the next section.
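Both linearizations can be compared on a small example. This is a sketch assuming NumPy and a nonsingular A, so that each GEP L v = λ R v can be turned into a standard eigenproblem by solving with R; the function names `linearize`, `gep_eigenvalues` and `qep_residual` and the test matrices are hypothetical. A dedicated generalized eigensolver would be the more robust choice when A is singular or ill-conditioned.

```python
import numpy as np

def linearize(A, B, C):
    """Return the two pencils (L, R) with L v = lam R v, v = [z; x], z = lam x."""
    n = A.shape[0]
    I, O = np.eye(n), np.zeros((n, n))
    first = (np.block([[-B, -C], [I, O]]), np.block([[A, O], [O, I]]))
    second = (np.block([[O, -C], [I, O]]), np.block([[A, B], [O, I]]))
    return first, second

def gep_eigenvalues(L, R):
    """Eigenvalues of L v = lam R v, assuming R is nonsingular."""
    return np.linalg.eigvals(np.linalg.solve(R, L))

def qep_residual(lam, A, B, C):
    """Smallest singular value of Q(lam); close to zero at an eigenvalue."""
    return np.linalg.svd(lam ** 2 * A + lam * B + C, compute_uv=False)[-1]

# Illustrative overdamped data, so that all eigenvalues are real
A = np.diag([2.0, 1.0])
B = np.array([[7.0, 1.0], [1.0, 8.0]])
C = np.array([[1.0, 0.5], [0.5, 2.0]])

(L1, R1), (L2, R2) = linearize(A, B, C)
e1 = gep_eigenvalues(L1, R1)
e2 = gep_eigenvalues(L2, R2)
res1 = max(qep_residual(l, A, B, C) for l in e1)
res2 = max(qep_residual(l, A, B, C) for l in e2)

print(np.sort(e1.real))
print(np.sort(e2.real))
print(res1, res2)
```

Both pencils yield the same 2n eigenvalues here, which illustrates that different linearizations share the spectrum of the QEP while differing in structure.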

3.3. Physical Background

We now look at applications of quadratic eigenvalue problems in engineering. The most extensive survey of QEP applications is given in [10]. We have already mentioned in the introduction that eigenvalue problems arise in connection with differential equations or systems of differential equations. Differential equations, and hence eigenvalue problems, are used most commonly in structural mechanics. Note that the ultimate goal is to determine the effect of vibrations on the performance and reliability of the system, and to control these effects.

We now demonstrate the linearization of a QEP on a concrete example from engineering. Small vibrations of a system with n unknowns are described by the following system of differential equations

$$M\ddot{y} + C\dot{y} + Ky = 0, \tag{23}$$

where M is the mass matrix, C is the viscous damping matrix, and K is the stiffness matrix. For physical reasons, M and K are related to the kinetic and strain energy, respectively, by quadratic forms, which makes them symmetric. For most structures M and K are positive definite and sparse.

Substituting y = xe^(λt) we get, after rearrangement,

$$(\lambda^2 M + \lambda C + K)\,x\,e^{\lambda t} = 0,$$

respectively

$$(\lambda^2 M + \lambda C + K)x = 0. \tag{24}$$

Therefore, the system (23) has a nontrivial solution y of this form if λ is chosen such that the QEP (24) has a nontrivial solution x.

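We now apply the linearization method presented in Section 3.2 to the QEP (24). Thus we obtain the corresponding GEP

$$\begin{pmatrix} -C & -K\\ I & O \end{pmatrix} \begin{pmatrix} z\\ x \end{pmatrix} = \lambda \begin{pmatrix} M & O\\ O & I \end{pmatrix} \begin{pmatrix} z\\ x \end{pmatrix}.$$

When the system is undamped (C = O), the QEP (24) reduces to

$$\lambda^2 M x + K x = 0, \quad \text{i.e.} \quad Kx = \omega M x \ \text{ with } \ \omega := -\lambda^2.$$

Because M and K are in most cases symmetric, positive definite and sparse, the resulting GEP is easy to solve.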
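For the undamped case, the symmetric definite GEP Kx = ωMx can be reduced to a standard symmetric eigenproblem by a Cholesky factorization of M. The sketch below assumes NumPy and a hypothetical chain of three masses and springs; the numerical values are made up for illustration. It also cross-checks the result against the linearized pencil with C = O, whose eigenvalues satisfy ω = −λ².

```python
import numpy as np

# Hypothetical 3-mass spring chain (values chosen only for illustration)
M = np.diag([1.0, 2.0, 1.5])
K = 10.0 * np.array([[2.0, -1.0, 0.0],
                     [-1.0, 2.0, -1.0],
                     [0.0, -1.0, 2.0]])
n = M.shape[0]

# Reduce K x = omega M x to S y = omega y with M = G G^T and x = G^{-T} y
G = np.linalg.cholesky(M)
G_inv = np.linalg.inv(G)
S = G_inv @ K @ G_inv.T
omega, Y = np.linalg.eigh(S)
X = G_inv.T @ Y                       # eigenvectors of the GEP, one per column

# Cross-check with the linearization for C = O: lam = +/- i*sqrt(omega)
I, O = np.eye(n), np.zeros((n, n))
L = np.block([[O, -K], [I, O]])
R = np.block([[M, O], [O, I]])
lam = np.linalg.eigvals(np.linalg.solve(R, L))
omega_from_lam = np.sort((-lam ** 2).real)

print("omega:", omega)
print("-lam^2:", omega_from_lam)
```

Each ω appears twice among the values −λ² because the eigenvalues of the undamped system come in purely imaginary conjugate pairs.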

4. Conclusion

Because of its great practical importance, the eigenvalue problem occupies an important place in linear algebra. In this chapter we discussed linear and quadratic eigenvalue problems. In particular, we put emphasis on numerical methods such as the QR algorithm and Rayleigh quotient iteration for linear eigenvalue problems, and on linearization and minmax characterization for quadratic eigenvalue problems. The whole chapter shows that the structure of the matrices involved in the eigenvalue problem strongly influences the choice of method. It is also clear that exploiting the matrix structure can make existing algorithms much more efficient. Further studies will therefore aim at making greater use of the properties of the matrices involved in the eigenvalue problem in order to improve the effectiveness of the methods. Finally, we point out that in this chapter we introduced new Rayleigh functionals for gyroscopically stabilized systems that enable a complete minmax (maxmin) characterization of the eigenvalues. This is a new strategy. We have proved all relevant properties of the new Rayleigh functionals.

References

1. Kostić, A.: Methods for the determination of some extremal eigenvalues of a symmetric Toeplitz matrix [thesis]. Hamburg, Germany: TUHH
2. Parlett, B. N.: The Symmetric Eigenvalue Problem. SIAM Classics in Applied Mathematics 20, Philadelphia, 1998. DOI: 10.1137/1.9781611971163
3. Cuppen, J. M.: A divide and conquer method for the symmetric tridiagonal eigenproblem. Numer. Math. 1981; 36: 177–195. DOI: 10.1007/BF01396757
4. Gu, M. & Eisenstat, S. C.: A divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem. SIAM J. Matrix Anal. Appl. 1995; 16: 172–191. DOI: 10.1137/S0895479892241287
5. Chapra, S. C. & Canale, R. P.: Numerical Methods for Engineers. McGraw-Hill Book Company, Singapore, 1990.
6. Duffin, R. J.: A minimax theory for overdamped networks. J. Rat. Mech. Anal. 1955; 4: 221–233.
7. Rogers, E.: A minimax theory for overdamped systems. Arch. Ration. Mech. Anal. 1964; 16: 89–96. DOI: 10.1007/BF00281333
8. Voss, H.: A minmax principle for nonlinear eigenproblems depending continuously on the eigenparameter. Numer. Lin. Algebra Appl. 2009; 16: 899–913. DOI: 10.1002/nla.670
9. Kostić, A. & Voss, H.: On Sylvester's law of inertia for nonlinear eigenvalue problems. Electr. Trans. Numer. Anal. 2013; 40: 82–93.
10. Tisseur, F. & Meerbergen, K.: The quadratic eigenvalue problem. SIAM Review 2001; 43: 235–286.

[1] 1. Kostić, A.Methods for the determination of some extremal eigenvalues of a symmetric Toeplitz matrix [thesis]. Hamburg: Germany TUHH

[2] 2. Parllet, B. N.: The Symmetric Eigenvalue Problem. SIAM Classicis in Appied Mathematihs 20, Philadelphia, 1998. DOI: http://dx.do.org/20.1137/1.9781611971163.fm1998,

[3] 3. Cuppen, J. M.: A divide and conquer method for the symmetric tridiagonal eigenproblem. Numer. Math. 1981; 36: 177–195. DOI: 10.1007/BFo1396757

[4] 4. Gu, M & Eisenstat A.: Divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem. SIAM J. Matr. Anal. Appl. 1995; 16: 172–191. DOI:10.1137/S0895479892241287

[5] 5. Chapra, S. T & Canale R. P.: Numerical methods for Engineers. McGraw-Hill Book Company, Singapore, 1990. DOI: 456789 BJE95432

[6] 6. Duffin, R. J.: A minimax theory for overdamped networks, J. Rat. Mech. Anal., 1955; 4: 221–233.

[7] 7. Rogers, E.: A minimax theory for overdamped systems, Arch. Ration. Mech. Anal., 1964; 16: 89–96. DOI: 10.1007/BF00281333

[8] 8. Voss, H.: A minmax principle for nonlinear eigenproblems depending continuously on the eigenparameter, Numer. Lin. Algebra Appl. 2009; 16: 899–913. DOI: 10.1002/nla.670

[9] 9. Kostić, A & Voss, H.: On Sylvester’s law of inertia for nonlinear eigenvalue problems, Electr. Trans. Numer. Anal., 2013; 40: 82–93.

[10] 10. Tisseur, F & Meerbergen, K.: The quadratic eigenvalue problem, SIAM Review, 2001; 43:235–286.

Eigenvalue Problems

Applied Linear Algebra in Action

