New Matrix Series Formulae for Matrix Exponentials and for the Solution of Linear Systems of Algebraic Equations

The solution of certain differential equations is expressed using a special type of matrix series and is directly related to the solution of general systems of algebraic equations. Efficient formulae for matrix exponentials are derived in terms of rapidly convergent series of the same type. They are essential for two new solution methods, especially beneficial for large linear systems, namely an iterative method and a method based on an exact matrix product formula. The computational complexity of these two methods is analysed, and for both of them, the number of matrix exponential-vector multiplications required for an imposed accuracy can be predetermined in terms of the system condition. The total number of arithmetic operations involved is roughly proportional to n 2 , where n is the matrix dimension. The common feature of all the series in the results presented is that starting with a first term that is already well-conditioned, each subsequent term is computed by multiplication with an even better conditioned matrix, tending quickly to the identity matrix. This contributes substantially to the stability of the numerical computation. A very efficient method based on the numerical integration of a special kind of differential equations, applicable to even ill-conditioned systems, is also presented.


Introduction
New matrix series expressions were recently derived by the author [1] for the solution of simple first order differential equations associated with general systems of linear algebraic equations. These differential equations describe the orthogonal trajectories of a family of hypersurfaces that represent a quadratic functional related to the linear algebraic system. The solution of the latter can be obtained by minimizing the functional along an orthogonal trajectory instead of applying various techniques based on minimization along conjugate gradient directions or based on minimized iterations [2]. Since the solutions of the differential equations considered are simply related to the solutions of the corresponding algebraic systems through matrix exponentials, there is the possibility to develop efficient solution methods if only the matrix exponentials could be used in numerical calculations accurately and with a small computational effort. A survey of various existent algorithms for computing matrix exponentials and a useful discussion of the difficulties involved are presented in [3].
In the present work, we use new formulae for arbitrary matrix exponentials that contain highly convergent infinite series which allow accurate and stable numerical computations. Employing these formulae, two new solution methods are proposed which are particularly efficient for large-scale general linear algebraic systems.

Differential equations associated with linear systems of algebraic equations
We start with simple vector differential equations whose solutions are related to the solution of general systems of linear algebraic equations. Later, in Section 5 we construct differential equations that allow an efficient numerical integration in order to obtain the solution of these systems.

Matrix series solution of some vector differential equations
Consider a first order vector differential equation of the form Two particular cases are presented.
The solution of (1) with f v ð Þ ¼ 1=v can also be obtained by employing a Taylor series expansion about and evaluating the derivatives yields where I is the identity matrix. The series in (10) is convergent for v ∈ 0, 2 ð Þ but its rate of convergence is, in general, very small for v very close to zero.

Relationship with linear systems of algebraic equations
Consider now a system of equations written in matrix form as ð11Þ and assume that x is a continuous function of the real variable v over a certain interval. The solution of (11) can be obtained from (3) for a v v S for which To satisfy this condition g v ð Þ in (3) must be chosen such that For positive definite matrices A and a finite g v o ð Þ this is achieved if g v ð Þ ! À∞ for v ! v S . Thus, the solution of (11) cannot be computed directly from (6). On the other hand, in the particular case of f v ð Þ ¼ 1=v, g v ð Þ ¼ ln v the solution of (11) can be obtained from (10) The expression in the brackets is just the inverse of A, The rate of convergence of the series in (14) and (15) is very small and they cannot be practically used in numerical computation for arbitrary matrices.
We note that the solution of (11) can be formally expressed from (6) [5] ð17Þ While each new term in such series as those in (10), (14) and (15) is calculated through a multiplication with a matrix that becomes more and more well-conditioned as k increases, the computation with the expression in (17) would require successive multiplications with the same original matrix and, for each k, a new polynomial is to be constructed and new Stirling numbers have to be generated. The formulae derived in the next two sections contain the same type of series and, therefore, are simpler and more efficient to be used for numerical computations.

New formulae for computing matrix exponentials
In this section, we derive matrix exponential expressions which contain highly convergent infinite series that allow accurate and stable numerical computations in numerous applications. They shall also be used in the next section.

Series expressions for matrix exponentials
Consider the matrix function v A e A ln v where v is a positive real variable and A a general square matrix with real number entries. By integration, For v ∈ 0, 2 ð Þ the integrand can be expanded in a power series as Functional Calculus that can be integrated term by term. From (18) and (19) ð20Þ which is valid for any A, positive definite or not, of arbitrary condition. One can see that, for a positive definite A and for v ! 0, the expression in the brackets of (20) gives a series expansion for the inverse A À1 (see (14) and (15)). Various expressions for the matrix exponential are obtained by giving particular values to v in (20). For example, for any v ¼ e Àq , q> À ln 2, we have ð21Þ the series becoming less convergent as q increases above q = 0. On the other hand, with v ¼ 1=e in (20) and then A replaced with qA we obtain for any real number q ð22Þ this series being more convergent than that in (21) for greater values of q. For any v ∈ 1, 2 ð Þ, the terms in the series expressions derived from (20) have coefficients that are alternately positive and negative. With v ¼ e 1=2 = 1.64872127, e.g., and then replacing A with 2A we have As well, by replacing A by À qA in (23) one obtains instead of (22) an expression with alternating in sign series coefficients.

Rapidly convergent series formulae
From the basic Eq. (20) we derive now formulae which contain series that have a higher rate of convergence than those presented in the previous subsection.
Firstly, it is obvious that for values of v close to 1 the series in (20) has a high rate of convergence. For instance, with v ¼ 1 þ 10 Àq , 10 Àq ≪ 1, and replacing where c q 1= ln 1 þ 10 Àq ð Þ . This series is rapidly convergent. Secondly, the convergence of the series in the expressions derived from (20) can further be improved by successive integrations. Indeed, integrating both sides of (20) from 1 to v, we have ð25Þ Integrating repeatedly we obtain the identity ð26Þ Same result is obtained by replacing A with pI þ A in (20). This identity contains an infinite series whose coefficients decrease rapidly as p increases. Obviously, for a given A and v, (26) generates more efficient computational formulae than those in the previous subsection. As before, for v ∈ 1, 2 ð Þ the infinite series have coefficients that alternate in sign. For example, with v ¼ e 1=2 in (26) and then replacing A by 2A and p by 2p we have instead of (23) Taking v ¼ 1 þ 10 Àq , with q>0 and c q 1= ln 1 þ 10 Àq ð Þ , (26) gives (compare with (24)) ð28Þ Notice that, in the new formulae derived from (26) the infinite series are very rapidly convergent, with their rate of convergence increasing when the parameter p increases. Highly accurate numerical results can be generated with only a small number of terms retained in the infinite series of these formulae (see Section 4).
Note. All the formulae presented in this section remain valid if A is changed in ÀA. Obviously, in all these expressions A can be replaced by a real number and the identity matrix I by 1, yielding a few novel identities and summation formulae for series of real numbers. Also, the expression in the brackets of (26) for v ! 0 is just I þ A=p ð Þ À1 =p if I þ A=p is positive definite and, thus, we obtain another identity, i.e., 6 Functional Calculus ð29Þ which reduces for A ¼ 0 to an elementary binomial sum.

Solution of general linear systems
In what follows, we apply the matrix exponential formulae from the previous section and present a new iteration procedure and a matrix product formula for the solution of large systems of linear algebraic equations.

An iterative method
Equation (16) can be written in the form To evaluate the amount of computation necessary to obtain the solution of (11) with a certain accuracy, let us take N such that N A k k ¼ 10 when one needs about 30 terms in the infinite series, i.e., 30 matrix-vector multiplications. The number of iterations increases with the condition number of A. To see this and to determine the corresponding number of iterations, consider (11) with b ¼ 0 and A replaced with a diagonal matrix whose entries are positive numbers, the greatest of these being 1, and whose condition is the same as that of A. The solution of this system is 7 New Matrix Series Formulae for Matrix Exponentials and for the Solution of Linear Systems… DOI: http://dx.doi.org/10.5772/intechopen.89342 x S ¼ 0 and the components of the solution of (1), with f v ð Þ ¼ 1=v, are x ok v λ k where λ k , k ¼ 1, 2, …, n, are the entries in the diagonal matrix. In order to make the magnitudes of all these components at least 1%, e.g., of the corresponding magnitudes of the initial components, one needs no iteration if the condition number is less than 2, but 5 and, respectively, 46 iterations are needed if the condition number is 10 and 100.
What is remarkable in the iterative method based on (32) is that, for matrices with same condition number and same norm, the number of iterations required is the same, independently of the size of the matrices. Considering approximately 2n 2 arithmetic operations for one matrix-vector multiplication, where n is the number of equations and unknowns in (11), the total number of arithmetic operations required is, thus, proportional to only n 2 . In the examples given above one has to perform, respectively, 60n 2 , 300n 2 and 2760n 2 arithmetic operations. Assuming only 2n 3 =3 arithmetic operations for the Gaussian elimination procedure, the method presented in this subsection requires less computation for the same examples if, respectively, n>90, n>450 and n>4140. One can also notice that the application of Eq. (32) leads to the actual solution of (11) independently of the small error introduced in the computation at each iteration.

A matrix product formula
The original general system (11) is replaced with an equivalent system such that its solution is obtained in terms of matrix exponentials for which highly convergent and accurate series formulae have been derived in Section 3.
Namely, instead of (11) we use the system ð33Þ where α is a real scalar to be chosen, α 6 ¼ 0, and Assuming A to be positive definite, α is taken positive. Then, since e ÀαA < 1 for a normal matrix, the solution can be expressed as with the norm of the matrix exponentials decreasing when k increases [6] ð36Þ where λ is the smallest eigenvalue of A. b α can be accurately computed by using instead of (34) an equivalent expression, for instance (see (22)) ð37Þ 8

Functional Calculus
If the infinite series in (35) is truncated to k ¼ N S the rest of the series has a norm ð38Þ Much less numerical computation (see below) is needed if the infinite series in (35) is transformed into an infinite product using the identity [5] ð39Þ which is also valid for matrices whose norm is less than 1. Thus, (35) becomes ð40Þ with the norm of the exponentials e À2 k αA decreasing very rapidly when k increases. Truncating the infinite product to k ¼ N P , i.e., N P þ 1 factors, leaves a remaining factor ð41Þ whose norm is ð42Þ Let us compare the maximum value of the norm of the truncated matrix in the brackets of (35) and (40) with that of the corresponding untruncated matrix in order to get a rough estimate of the numbers N S and N P of matrix exponentials involved in the numerical computation to achieve a certain accuracy. This will also allow to estimate the total number of matrix-vector multiplications necessary to obtain the solution. The ratio of the maximum norm of the truncated matrix to the maximum norm of the untruncated matrix in the brackets of (35) and (40) To illustrate the computation complexity when using (35) or (40), assume that A k k ¼ 1 and α ¼ 20. If ρ is imposed to be ≈0:99, for example, one needs N S ¼ 2, i.e., three terms in (35) and N P ¼ 1, i.e., two factors in (40) if λ ¼ 10 À1 . If λ ¼ 10 À2 these numbers increase to N S ¼ 23 and N P ¼ 4, and when λ ¼ 10 À3 one gets N S ¼ 230, but N P only increases to N P ¼ 7. It is clear that applying the formula (40) the number of exponentials needed in the numerical computation is much smaller than that if formula (35) would be applied. For all the matrix exponentials involved in the numerical computation we use the formula (28) containing a highly convergent series, such that ð45Þ where q>o and c q 1= ln 1 þ 10 Àq ð Þ . With α A k k ¼ 20 and choosing q ¼ 1 and p ¼ 10, for instance, e ÀαA is determined accurately by retaining 50 terms in the infinite series and, thus, to multiply e ÀαA with a vector one needs 71 matrix-vector multiplications. To compute x s from (40) one has to use repeatedly the multiplication of e À20A with a vector. For a matrix A with λ ¼ 10 À1 one has to retain two factors in (40) and, thus, to multiply e À20A and e À2Â20A with a vector. This means to use repeatedly 3 times the multiplication of e À20A with a vector which requires, therefore, 3 Â 71 ¼ 213 matrix-vector multiplications. When λ ¼ 10 À2 the infinite product in (40) is truncated at k ¼ N P ¼ 4 and this requires the multiplication of e À2 k Â20A , k ¼ 0, 1, 2, 3, 4, with a vector, i.e., to use repeatedly 31 times the multiplication of e À20A with a vector, for a total of 31 Â 71 ¼ 2201 matrix-vector multiplications. We also have to add the matrix-vector multiplications required to compute b α in (37). A very accurate result for b α when α ¼ 20 can be achieved by applying four times the series in the brackets of (37) for α ¼ 5, each time retaining 30 terms. This requires a total of about 120 multiplications of a matrix I À 5A=k with a vector. In all the matrix-vector multiplications involved when applying (45), the matrices are in the form I AE c q αA=k and become better and better conditioned as k increases.
Adding up the number of arithmetic operations involved shows that, with respect to the classical Gaussian elimination method, the procedure presented in this subsection is advantageous for very large systems (11). Namely, assuming same accuracy and only 2n 3 =3 arithmetic operations for the Gaussian elimination, with the data given above, one has to have n>3 Â 213 þ 120 ð Þ¼999 equations and unknowns if λ ¼ 10 À1 and n>3 Â 2201 þ 120 ð Þ¼6963 equations and unknowns if λ ¼ 10 À2 for the proposed method to be more advantageous. For a given α A k k, one application of e ÀαA requires a determined finite number of matrix-vector multiplications, independently of the size of A. It is remarkable, as for the iterative method in the previous subsection, that for a given condition of A, one has to apply e ÀαA a well-determined number of times and, thus, the total number of arithmetic operations necessary to compute the solution with an imposed accuracy is proportional to only n 2 .
It should be noted that, since the infinite series in the expression (45) is truncated and thus determined with a finite accuracy, the accuracy of the solution x S becomes compromised after a too big a number of matrix exponential-vector multiplications. This is why, the worse conditioned systems (11) should be appropriately preconditioned. Practically, the computation with (40) is continued factor by factor and the accuracy of x is checked after each step.

Solution of general linear systems by numerical integration of differential equations
In this section, we introduce first order differential equations whose numerical integration allows to efficiently find the solution of linear systems of algebraic equations. Differential equations of the type of those in (1), with f v ð Þ ¼ À1 or f v ð Þ ¼ 1=v, cannot be used for this purpose due to the fact that the first and higher order derivatives of x v ð Þ tend to infinite values as x tends to the solution x S of (11) (see Section 2).
Here below, we construct ordinary differential equations for x v ð Þ which satisfy the condition that the first few derivatives are finite when x v ð Þ tends to x S and, therefore, are particularly useful for an accurate computation of x S . Let us consider the system (11) with a symmetric positive definite matrix. A quadratic functional ð46Þ is associated with (11) [6] where r is a real number to be chosen, r>0, and, thus, This is the differential equation to be integrated Higher order derivatives can be worked out if needed. In order to see the behaviour of the derivatives close to the solution x S , Eqs. (49) and (50) Notice that as x tends to x s , when Ax À b k k ε tends to zero, where Another differential equation we present here is ð55Þ with the second derivative In this case, always even for x ! x S , but the second derivative tends to an infinite value when where K s ð Þ is finite and ε Ax À b k k . The relationship between the differentials of the variables v and s in (49) and (55) is and for x ! x S we have (see (54)) The differential Eqs. (49) in v and (55) in s require practically the same amount of computation for their right-hand sides, i.e., one matrix-vector multiplication. The first derivatives dx=dv for r ¼ 2 and dx=ds remain finite when x tends to x S , while the second derivatives increase to infinite values as in (54) and (58). For r ¼ 4 the second derivative in (52) remains finite when x tends to x S (see (54)), while the first derivative and the ratio ds=dv tend to zero as in (54) and (60), respectively. If r>4 even d 2 x=dv 2 tends to zero as in (54).
Equations (49) and (55) can be integrated by classical numerical methods. Since we are not looking for an accurate solution of these equations all along from x o to x S but for finding accurately the final value x ¼ x S , we can use a lower order method, for instance, even Euler's method [7]. This yields an approximate value of x S which is to be used as initial point for repeating the numerical integration procedure. As we get closer to the solution x S , we decrease the step size in order to reduce the error. In the case of Euler's method the error is determined in terms of the norm of the second derivative. Higher order numerical integration methods can also be used in order to increase the computation efficiency.
To find a starting point for the integration procedure which is reasonably close to the solution point, one can minimize F x ð Þ in (46) along the normal direction, followed by a minimization of the distance to the solution point x S along the direction of the normal to F [8]. These two preliminary steps are repeated a few times as needed.
Numerical experiments have been performed applying Euler's method to (49) for r ¼ 2, r ¼ 4 and r ¼ 8, and to (55). Systems (11) of various sizes have been automatically generated and the differential Eqs. (49) for r ¼ 2 and r ¼ 4, and (55) have produced results with the least amount of computation when imposing an accuracy of 1%.

Conclusions
A special type of matrix series are used in Section 2 to express the relationship between some first order ordinary differential equations and systems of linear algebraic equations and, also, in Section 3 to derive efficient formulae for matrix exponentials that allow accurate and stable numerical computations in various applications. The main feature of these series consists in the fact that, starting with their first term which is already a matrix substantially better conditioned than the original problem matrix, each of the subsequent terms is obtained through a multiplication with a better and better conditioned matrix that tends to the identity matrix. The new matrix exponential formulae contain very rapidly convergent series and can be applied to general, arbitrarily conditioned, positive definite or not matrices. They are used in Section 4 for two new methods of solution for general