
Using Matrix Differential Equations for Solving Systems of Linear Algebraic Equations

Written By

Ioan R. Ciric

Submitted: 23 January 2022 Reviewed: 03 March 2022 Published: 17 April 2022

DOI: 10.5772/intechopen.104209

From the Edited Volume

Matrix Theory - Classics and Advances

Edited by Mykhaylo Andriychuk


Abstract

Various ordinary differential equations of the first order have recently been used by the author for the solution of general, large linear systems of algebraic equations. Exact solutions were derived in terms of a new kind of infinite series of matrices, which are truncated and applied repeatedly to approximate the solution. In these matrix series, each new term is obtained from the preceding one by multiplication with a matrix that becomes better and better conditioned, tending to the identity matrix. Obviously, this helps the numerical computations. For a more efficient computation of approximate solutions of the algebraic systems, we consider new differential equations which are solved by simple techniques of numerical integration. The solution procedure allows one to easily control and monitor the magnitude of the residual vector at each step of integration. A related iterative method is also proposed. The solution methods are flexible, permitting various intervening parameters to be changed whenever necessary in order to increase their efficiency. Efficient computation of a rough approximation of the solution, applicable even to poorly conditioned systems, is also performed based on the alternate application of two different types of minimization of associated functionals. A smaller amount of computation is needed to obtain an approximate solution of large linear systems as compared to existing methods.

Keywords

  • matrix equations
  • large linear algebraic systems
  • solution by numerical integration

1. Introduction

Exact analytic expressions in the form of infinite series of matrices for the solution of linear systems of algebraic equations were derived in [1] by integrating associated ordinary differential equations. These differential equations were obtained using a quadratic functional related to the system of algebraic equations and describe the orthogonal trajectories of the family of hypersurfaces representing the functional. More convergent matrix series, which can be applied to approximate the solution of the system of equations, were presented in [2]. The solution of linear systems based on the numerical integration of differential equations was originally formulated in [3].

In Section 2 of the present book chapter, we use recently derived highly convergent series formulae for matrix exponentials [3] in order to construct improved iterative methods for solving approximately large systems of algebraic equations. In Section 3, we use novel functionals that allow us to formulate differential equations which lead to a substantial increase in the efficiency of the solution process [4]. Independently of the starting point, the paths of integration of these equations all converge to the solution point of the system considered. At each step of the numerical solution one passes, in fact, from one path to a slightly different one due to computation errors. The procedure does not require finding an entire path accurately, but only the solution point, which is common to all the paths. This is why we apply the simple Euler method [5] to integrate the differential equations. The computation errors are now determined by the magnitude of the second derivative of the position vector with respect to the parameter defining the location along the path. A related iterative method [6] is also described. In Section 4, two different kinds of minimization of the system functionals are applied alternately for quick computation of a rough solution of large linear systems [7].


2. Matrix series formulae for the solution of linear algebraic systems

Consider a system of equations written in matrix form as

$$ Ax - b = 0 \qquad (1) $$

where $A \in \mathbb{R}^{n \times n}$ is an arbitrary nonsingular matrix, $b \in \mathbb{R}^n$ is a given $n$-dimensional vector and $x = (x_1\; x_2\; \cdots\; x_n)^T$ is the unknown $n$-dimensional vector, with $T$ indicating the transpose. Assume that $x$ is a continuous function of the real variable $v$ over a certain interval and associate with (1) the first-order vector differential equation

$$ \frac{dx}{dv} = f(v)\,(Ax - b) \qquad (2) $$

where $f(v)$ is a continuous function to be chosen.

2.1 Exact analytic expressions for the solution of (2)

Imposing the condition $x(v_o) = x_o$, $x_o$ being a chosen position vector, (2) has a unique solution over a specified interval [4], namely,

$$ x(v) = x_o + e^{-A\,g(v_o)} \sum_{k=0}^{\infty} \frac{A^k}{(k+1)!}\left[g(v)^{k+1} - g(v_o)^{k+1}\right](Ax_o - b) \qquad (3) $$

where $g(v) \equiv \int f(v)\,dv$ is a primitive of $f(v)$, i.e., $f(v) = dg/dv$. If $f(v)$ is taken to be $f(v) = 1/v$, then $g(v) = \ln v$. Choosing $v_o = 1$ gives $g(v_o) = 0$ and (3) becomes

$$ x(v) = x_o + \ln v \sum_{k=0}^{\infty} \frac{(A\ln v)^k}{(k+1)!}\,(Ax_o - b) \qquad (4) $$

Over the interval $v \in (0, 2)$, $x(v)$ can also be expressed in the form [3]

$$ x(v) = x_o - (1-v)\left[ I + \sum_{k=1}^{\infty} \frac{(1-v)^k}{k+1}\,(I-A)\left(I-\frac{A}{2}\right)\cdots\left(I-\frac{A}{k}\right) \right](Ax_o - b) \qquad (5) $$

$I$ denoting the identity matrix of order $n$. The solution of (1) is theoretically obtained for $v = 0$, the series being, in general, extremely slowly convergent. Since the rate of convergence of the series in (5) is very small for $v$ very close to the value corresponding to the solution of (1), i.e., $v = 0$, this formula should be applied repeatedly for a $v$ not too close to zero, $v \in (0, 1)$, until a satisfactorily small distance to the solution $x(0)$ is reached.
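For illustration, the repeated application of the truncated series (5) can be sketched in a few lines of Python. This is only a schematic implementation under the assumption that the spectrum of $A$ is such that the series converges; the truncation order K, the value of v, and the tolerance are illustrative choices, not prescriptions from the text.

```python
import numpy as np

def series5_step(A, b, x_o, v, K):
    """One application of the series (5), truncated after K terms."""
    r = A @ x_o - b       # residual A x_o - b
    w = r.copy()          # w_k = (I - A)(I - A/2)...(I - A/k) r
    s = r.copy()          # accumulates [I + sum_k (1-v)^k/(k+1) * prod_k] r
    u = 1.0               # (1 - v)^k
    for k in range(1, K + 1):
        w = w - (A @ w) / k
        u *= 1.0 - v
        s = s + (u / (k + 1)) * w
    return x_o - (1.0 - v) * s

# Repeated application for a v not too close to zero, e.g. v = 0.5:
# while np.linalg.norm(A @ x - b) > tol:
#     x = series5_step(A, b, x, 0.5, 30)
```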

2.2 Highly convergent series iteration formulae

Practical formulae for an approximate solution of (1) can be derived by using matrix series that are much more convergent than the one in (5). This can be done by writing (4) in the form

$$ x(v) = x_o + A^{-1}\left(e^{A\ln v} - I\right)(Ax_o - b) \qquad (6) $$

and by expressing the matrix exponential in terms of series of superior convergence given recently in ref. [3]. Very close to the solution $v = 0$, say for $v = e^{-N}$, $N \gg 1$, we have $e^{A\ln v} = e^{-NA}$ and, with [3]

$$ e^{-NA} - I = -10^{-q}c_qNA\left[ I + \sum_{k=1}^{\infty} \frac{(-1)^k\,10^{-qk}}{k+1}\,(I+c_qNA)\left(I+\frac{c_qNA}{2}\right)\cdots\left(I+\frac{c_qNA}{k}\right) \right] \qquad (7) $$

where $q > 0$ and $c_q \equiv 1/\ln\left(1 + 10^{-q}\right)$, we get

$$ x(e^{-N}) = x_o - 10^{-q}c_qN\left[ I + \sum_{k=1}^{\infty} \frac{(-1)^k\,10^{-qk}}{k+1}\,(I+c_qNA)\left(I+\frac{c_qNA}{2}\right)\cdots\left(I+\frac{c_qNA}{k}\right) \right](Ax_o - b) \qquad (8) $$

In order to perform numerical computations, the number of terms retained in the series has to be appropriately chosen. To have a small $c_q$ in (8), one has to take a small $q$. For $q = 0.5$, e.g., $c_q = 3.639409$. One may start with $N = 15$, which would require retaining about 50 terms in this alternating series to get a rough approximation of $x(e^{-15})$. The computation is repeated with the new $x_o$ taken to be the preceding $x(e^{-N})$ until an acceptable accuracy of the solution of (1) is reached.
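A minimal sketch of one pass of (8), with the parameters suggested above ($q = 0.5$, $N = 15$, about 50 terms); these values and the simple loop structure are assumptions for illustration, to be tuned in practice.

```python
import numpy as np

def series8_step(A, b, x_o, N=15, q=0.5, K=50):
    """One application of (8): approximates x(e^{-N}) starting from x_o."""
    c_q = 1.0 / np.log(1.0 + 10.0 ** (-q))
    r = A @ x_o - b       # residual A x_o - b
    w = r.copy()          # w_k = (I + c_q N A)(I + c_q N A/2)...(I + c_q N A/k) r
    s = r.copy()          # bracketed series of (8) applied to r
    u = 1.0               # (-1)^k 10^{-qk}
    for k in range(1, K + 1):
        w = w + (c_q * N / k) * (A @ w)
        u *= -(10.0 ** (-q))
        s = s + (u / (k + 1)) * w
    return x_o - (10.0 ** (-q)) * c_q * N * s
```

Each pass uses the preceding result as the new $x_o$, as described above.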

Much more convergent formulae are constructed if we apply in (6) the expansion [3]

$$ e^{-NA} = \left(1 + 10^{-q}\right)^{-p}\Bigg\{ I + p!\sum_{k=1}^{p}\frac{10^{-qk}}{k!\,(p-k)!}\left(I-\frac{c_qNA}{p}\right)\left(I-\frac{c_qNA}{p-1}\right)\cdots\left(I-\frac{c_qNA}{p-k+1}\right) - \frac{10^{-(p+1)q}}{p+1}\,c_qNA\,(I-c_qNA)\left(I-\frac{c_qNA}{2}\right)\cdots\left(I-\frac{c_qNA}{p}\right)\left[ I + (p+1)!\sum_{k=1}^{\infty}\frac{(-1)^k\,10^{-qk}}{(k+1)(k+2)\cdots(k+p+1)}\,(I+c_qNA)\left(I+\frac{c_qNA}{2}\right)\cdots\left(I+\frac{c_qNA}{k}\right) \right] \Bigg\}, \quad p = 1, 2, \ldots \qquad (9) $$

The multiplication with $A^{-1}$ from (6) is avoided by arranging the first summation in (9) in terms of powers of $A$ and by taking into account that

$$ p!\sum_{k=1}^{p} \frac{10^{-qk}}{k!\,(p-k)!} = \left(1 + 10^{-q}\right)^p - 1 \qquad (10) $$

It is obvious that using (9) leads to computations with a much smaller number of terms retained in the series for a given $N$. With the same amount of computation, we obtain an $x(e^{-N})$ which is much closer to the exact solution $x(0)$ of the system of equations (1).


3. Methods of numerical integration

In this Section, we use special kinds of functionals that lead to the construction of differential equations which allow a substantial increase in the efficiency of the solution of large linear algebraic systems.

3.1 Vector differential equations and their application to the solution of (1)

Consider a functional of the form

$$ F(x) = \|Ax - b\|^{\alpha} \qquad (11) $$

associated with (1), where $\alpha$ is a positive real number to be chosen. A real variable $v$, $v > 0$, $v \in [v_o, v_S]$, is now defined by

$$ F(x(v)) = F(x(v_o))\,h(v) \qquad (12) $$

where $h(v)$ is an appropriately selected real function with definite first and second derivatives in $[v_o, v_S]$, $v_o$ corresponding to a starting point $x(v_o) \equiv x^0$, with $h(v_o) = 1$, and $v_S$ corresponding to the solution vector of (1), $x(v_S) \equiv x_S$, with $h(v_S) = 0$. Denoting $F(x^0) \equiv F_o$, we have

$$ F_o\,\frac{dh}{dv} = \alpha\,\|Ax - b\|^{\alpha - 2}\left[A^T(Ax - b)\right]^T \frac{dx}{dv} \qquad (13) $$

Thus,

$$ \frac{dx}{dv} = \frac{F_o}{\alpha}\,\frac{dh}{dv}\,\frac{A^T(Ax - b)}{\|Ax - b\|^{\alpha - 2}\,\|A^T(Ax - b)\|^2} \qquad (14) $$

which is the differential equation to be integrated from v=vo to v=vS. The second derivative of x is

$$ \frac{d^2x}{dv^2} = \frac{F_o}{\alpha}\,\frac{d^2h}{dv^2}\,\frac{A^T(Ax - b)}{\|Ax - b\|^{\alpha - 2}\,\|A^T(Ax - b)\|^2} + \left(\frac{F_o}{\alpha}\,\frac{dh}{dv}\right)^2 \frac{1}{\|Ax - b\|^{2(\alpha - 1)}\,\|A^T(Ax - b)\|^2} \times \left\{ \frac{\|Ax - b\|^2}{\|A^T(Ax - b)\|^2}\left(A^TA - \frac{2\,\|AA^T(Ax - b)\|^2}{\|A^T(Ax - b)\|^2}\,I\right) + (2 - \alpha)\,I \right\} A^T(Ax - b) \qquad (15) $$

From (11) and (12) we get a useful relationship,

$$ \|Ax - b\| = \left[F_o\,h(v)\right]^{1/\alpha} \qquad (16) $$

that allows one to simply monitor the magnitude of the residual vector of (1) during the computation process.

As explained in the Introduction we apply the Euler method for the solution of (14) and compute successively

$$ x^{i+1} = x^i + \eta\left.\frac{dx}{dv}\right|_{x = x^i}, \quad i = 0, 1, 2, \ldots \qquad (17) $$

where $\eta$ is the step size. In the absence of any hint about a good starting point $x^0$ corresponding to $v = v_o$, we have used the point along the normal from the origin $x = 0$ to the surface $F(x) = \mathrm{const}$ in (11) which is the closest to the solution point $x_S$ [8], i.e.,

$$ x^0 = \frac{\|b\|^2}{\|A^Tb\|^2}\,A^Tb \qquad (18) $$

The function $h(v)$ and the parameter $\alpha$ are chosen such that the first and the second derivatives of $x$ in (14) and (15) remain finite when $v \to v_S$, i.e., when the residual $\|Ax - b\| \to 0$, while the errors evaluated with the second derivative are kept reasonably small along the interval of integration $v \in [v_o, v_S]$. For each $h(v)$, $\alpha$ is determined by imposing the condition that the second derivative in (15) tends to zero as $\|Ax - b\| \to 0$. To decide on the value of $\alpha$ for a given $h(v)$, we require a good rate of decrease of $\|Ax - b\|$ at the beginning of the computation process, for instance (see (11) and (12))

$$ \frac{\|Ax^1 - b\|}{\|Ax^0 - b\|} = \left[h(\eta)\right]^{1/\alpha} \approx 0.8 \qquad (19) $$

when $\eta = 0.1$. Finally, to solve (1), we determine the actual step size to be used for the numerical solution such that the errors at the beginning of the computation process are small, say

$$ \frac{\eta^2}{2}\,\frac{1}{\|x^1\|}\left\|\left.\frac{d^2x}{dv^2}\right|_{x = x^0}\right\| < 0.01 \qquad (20) $$
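In place of evaluating (15) directly, the left side of (20) can also be estimated numerically; the finite-difference approximation of the second derivative below is our own simplification, assuming $h(v) = 1 - v$ so that the right side of (14) depends on $v$ only through $x$.

```python
import numpy as np

def step_error_ok(dxdv, x0, eta, tol=0.01):
    """Check (20), with d2x/dv2 at x0 replaced by a finite difference.

    dxdv(x) must evaluate the right-hand side of (14) at the point x.
    """
    x1 = x0 + eta * dxdv(x0)              # Euler step (17)
    d2x = (dxdv(x1) - dxdv(x0)) / eta     # ~ d2x/dv2 along the path
    return 0.5 * eta ** 2 * np.linalg.norm(d2x) / np.linalg.norm(x1) < tol
```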

If the magnitude of the residual vector, which is computed at each step, does not decrease any more, the computation is continued with a new cycle of integration by applying (17) with $x^0$ replaced by a new starting point, i.e.,

$$ x_{\mathrm{new}} = x_{\mathrm{last}} - \frac{\|Ax_{\mathrm{last}} - b\|^2}{\|A^T(Ax_{\mathrm{last}} - b)\|^2}\,A^T(Ax_{\mathrm{last}} - b) \qquad (21) $$

where $x_{\mathrm{last}}$ is the position vector from the preceding step. $x_{\mathrm{new}}$ is the closest point to $x_S$ along the normal to the surface $F(x) = \mathrm{const}$ taken at $x_{\mathrm{last}}$ [8]. Subsequent cycles of integration are performed in the same way until a satisfactory approximate solution of (1) is obtained. If the difference between $x_{\mathrm{new}}$ and $x_{\mathrm{last}}$ is insignificant, we find the point along the normal to $F(x) = \mathrm{const}$, taken at $x_{\mathrm{new}}$, where $F$ has a minimum and then apply (21) again. It should be noted that, as one approaches the solution point, the direction of the gradient $A^T(Ax - b)$ tends to become more and more perpendicular to the direction of $x_S - x$ and, thus, the residual $\|Ax - b\|$ and the relative error $\|x - x_S\|/\|x_S\|$ will not decrease any more, as expected. This is why the computation has to be continued by opening a new cycle of integration. For systems with higher condition numbers, this happens more quickly.

Numerical experiments using $h(v) = 1 - v$ with $\alpha = 0.45$, and also using $h(v) = (1 - v)^2$ with $\alpha = 0.9$, show that only two to five integration cycles with a step size of 0.1 are needed in order to get an accuracy of about 1% for the solution of systems with condition numbers up to 100.
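A minimal sketch of one integration cycle, assuming $h(v) = 1 - v$ (so $dh/dv = -1$), $v_o = 0$, $\alpha = 0.45$, and $\eta = 0.1$; the cycle is stopped when the residual no longer decreases and is closed with the restart point (21). The stopping logic and parameter values are illustrative.

```python
import numpy as np

def integration_cycle(A, b, x, alpha=0.45, eta=0.1, max_steps=10):
    """One cycle of the Euler scheme (17) for Eq. (14) with h(v) = 1 - v."""
    Fo = np.linalg.norm(A @ x - b) ** alpha      # F_o, fixed for the whole cycle
    res_prev = np.inf
    for _ in range(max_steps):
        r = A @ x - b
        g = A.T @ r                              # gradient direction A^T(Ax - b)
        res = np.linalg.norm(r)
        if res >= res_prev:                      # residual no longer decreases
            break
        res_prev = res
        # Euler step of (14) with dh/dv = -1:
        x = x - eta * (Fo / alpha) * g / (res ** (alpha - 2.0) * (g @ g))
    # New starting point (21) for the next cycle:
    r = A @ x - b
    g = A.T @ r
    return x - ((r @ r) / (g @ g)) * g

# Starting point (18) when nothing better is known:
# g0 = A.T @ b
# x = ((b @ b) / (g0 @ g0)) * g0
# then apply integration_cycle repeatedly.
```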

3.2 A related iterative method

The basic idea of this method is to find, starting from a point $x^0$, a point $x^1$ along the gradient of a functional (11) associated with the general system (1), such that the magnitude of the new residual vector is a prescribed fraction of its initial value, $\|Ax^1 - b\| = \tau\|Ax^0 - b\|$, $\tau < 1$. Instead of performing an integration as in Section 3.1, one proceeds iteratively, i.e.,

$$ \|Ax^{i+1} - b\| = \tau\,\|Ax^i - b\|, \quad i = 0, 1, 2, \ldots \qquad (22) $$

for each iteration the starting point being the point found in the preceding iteration, with $\tau$ maintained as closely as possible at the same value. To do this, we impose that $F(x)$ in (11) varies from $x^i$ to $x^{i+1}$ in the same way for each iteration, namely,

$$ F(x) = F(x^i)\,h(v) \qquad (23) $$

where $h(v)$ is now a real function of a real variable, monotone decreasing in the interval $v \in [v_o, v_o + \eta]$, $v_o \geq 0$, $\eta > 0$, with definite first and second derivatives, and with $h(v_o) = 1$, such that

$$ F(x^{i+1}) = F(x^i)\,h(v_o + \eta), \quad i = 0, 1, 2, \ldots \qquad (24) $$

Then, from (11),

$$ \frac{F(x^{i+1})}{F(x^i)} = \frac{\|Ax^{i+1} - b\|^{\alpha}}{\|Ax^i - b\|^{\alpha}} = \tau^{\alpha} \qquad (25) $$

and

$$ \tau = \left[h(v_o + \eta)\right]^{1/\alpha} \qquad (26) $$

$x^{i+1}$ is computed as

$$ x^{i+1} = x^i + \eta\left.\frac{dx}{dv}\right|_{v = v_o} \qquad (27) $$

i.e., with (14) where $F_o$ is replaced by $F(x^i)$,

$$ x^{i+1} = x^i + \frac{\eta}{\alpha}\left.\frac{dh}{dv}\right|_{v = v_o} \frac{\|Ax^i - b\|^2}{\|A^T(Ax^i - b)\|^2}\,A^T(Ax^i - b), \quad i = 0, 1, 2, \ldots \qquad (28) $$

in which, taking into account (21), one has to have $\left|(\eta/\alpha)\,(dh/dv)|_{v=v_o}\right| < 1$. This expression corresponds to the first step in the numerical integration by Euler's method of the differential Eq. (14), starting from $x^i$ with a step $\eta$, or to the first two terms of the Taylor series expansion of $x(v_o + \eta)$.

The starting value $x^0$ and the function $h(v)$ are chosen in the same way as in Section 3.1. For selected ratios $\eta/\alpha$, the iteration cycle continues as long as the residual $\|Ax^i - b\|$ decreases at a proper rate. Theoretically, to make $\|Ax^i - b\| = \varepsilon\|Ax^0 - b\|$ with $\varepsilon \ll 1$, one would need to conduct $\ln\varepsilon/\ln\tau$ iterations. Since computation errors are introduced at each iteration (as in the Euler method), the initially chosen value of $\tau$ cannot be maintained as the iterative process continues. An approximate solution of (1) is obtained at the end of the iteration cycle as before, by applying (21). Subsequent iteration cycles are performed with the starting point in each cycle being the point determined in the preceding cycle.

Numerical results were generated using, as in the method presented in Section 3.1, $h(v) = 1 - v$, but with $\eta/\alpha = 0.2$, and $h(v) = (1 - v)^2$ with $\eta/\alpha = 0.1$. Now, in both cases, $v_o = 0$, $(\eta/\alpha)\,|dh/dv|_{v=0} = 0.2$ in (28), and $\tau \approx 0.8$. A substantial increase in $\tau$, approaching $\tau = 1$, or oscillations of its value during the first few iterations show that the computation errors evaluated from (15) (with $F_o$ replaced by $\|Ax - b\|^{\alpha}$) are too big. In such a situation, one has to decrease the factor $(\eta/\alpha)\,|dh/dv|_{v=0}$ in (28), i.e., to decrease $\eta/\alpha$ for a given $h(v)$, which leads to an increase of $\tau$ and a decrease of $\frac{1}{2}\eta^2\|d^2x/dv^2\|_{v=v_o}$. For an accuracy of 1% for the solution of (1), one needs three to six short iteration cycles, with about eight iterations per cycle, for systems with condition numbers below 100. Of course, as in the previous method, an increased number of iteration cycles is required to reach the same accuracy for the solution of poorly conditioned systems.
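A sketch of one iteration cycle of (28), assuming $v_o = 0$ and $(\eta/\alpha)|dh/dv|_{v=0} = 0.2$ (e.g., $h(v) = 1 - v$ with $\eta/\alpha = 0.2$); the ratio of successive residual norms plays the role of $\tau$, and the threshold at which the cycle is abandoned is an illustrative choice.

```python
import numpy as np

def iteration_cycle(A, b, x, step=0.2, max_iter=8, tau_max=0.95):
    """One cycle of (28); step = (eta/alpha)|dh/dv| at v = v_o."""
    res_prev = np.linalg.norm(A @ x - b)
    for _ in range(max_iter):
        r = A @ x - b
        g = A.T @ r
        x = x - step * ((r @ r) / (g @ g)) * g   # update (28)
        res = np.linalg.norm(A @ x - b)
        if res / res_prev > tau_max:             # tau drifting toward 1: stop
            break
        res_prev = res
    # Close the cycle with (21), as in Section 3.1:
    r = A @ x - b
    g = A.T @ r
    return x - ((r @ r) / (g @ g)) * g
```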

Remarks. One can easily see that $x^{i+1}$ in (28) can be expressed in the form

$$ x^{i+1} = x^0 + \sum_{\ell=0}^{i} x_d^{\ell}, \quad i = 0, 1, 2, \ldots \qquad (29) $$

where

$$ x_d^{\ell} \equiv x^{\ell+1} - x^{\ell} = \frac{\eta}{\alpha}\left.\frac{dh}{dv}\right|_{v = v_o} \frac{\|Ax^{\ell} - b\|^2}{\|A^T(Ax^{\ell} - b)\|^2}\,A^T(Ax^{\ell} - b) \qquad (30) $$

with the solution of (1) given by the infinite series

$$ x_S = x^0 + x_d^0 + x_d^1 + x_d^2 + \cdots \qquad (31) $$

$x_d^{\ell}$ in (30) can also be written as

$$ x_d^{\ell} = \frac{\eta}{\alpha}\left.\frac{dh}{dv}\right|_{v = v_o} \frac{\|b^{\ell}\|^2}{\|A^Tb^{\ell}\|^2}\,A^Tb^{\ell} \qquad (32) $$

where

$$ b^{\ell} = Ax_d^{\ell-1} + b^{\ell-1}, \quad \ell = 1, 2, \ldots; \qquad b^0 = Ax^0 - b \qquad (33) $$

This shows that the difference $x_d^{\ell}$ is a “rough approximation” of the solution of a system (1) whose right-hand side is, at each iteration, just the residual vector of the system from the preceding iteration, which decreases in magnitude from one iteration to the next. Thus, the method presented in this Section represents a practical, concrete implementation of the well-known idea of successive approximations/perturbations [9].

To search for a possible increase in efficiency, more general functionals of the form

$$ F(x) = \mathcal{F}(\|Ax - b\|) \qquad (34) $$

could be tested, where $\mathcal{F}$ is a real function of the scalar $\|Ax - b\|$ whose value and first derivative are finite and continuous at all points within the interval of integration. The equation corresponding to (14) is now

$$ \frac{dx}{dv} = F_o\,\frac{dh}{dv}\left(\frac{d\mathcal{F}}{d\|Ax - b\|}\right)^{-1} \frac{\|Ax - b\|}{\|A^T(Ax - b)\|^2}\,A^T(Ax - b) \qquad (35) $$

Note. The paths of integration in the methods presented can be looked at as being the field lines of a Poissonian electrostatic field in a homogeneous region bounded by a surface of constant potential $F(x) = \|Ax_o - b\|^{\alpha}$ and with a zero potential at the solution point, $F(x_S) = 0$. In the particular case of $\alpha = 2$, the ratio of the volume density of charge within the region to the material permittivity is constant, namely, $2\sum_{i=1}^{n}\sum_{k=1}^{n} a_{ik}^2$, where $a_{ik}$ are the entries of $A$. By altering this electrostatic field, one could possibly make the approach to the solution point along the integration path quicker.


4. Method of alternate minimizations

This simple method is based on the property of a functional of the form (34), associated with a general system of equations (1), of allowing one to minimize not only the value of the functional but also the distance to the solution point of (1). Using only minimizations along ordinary gradients of the functional is not efficient unless the system is very well-conditioned.

To compute efficiently an approximate solution of general, large linear systems of algebraic equations, we propose in this Section the alternate application of minimizations of a functional and of the distance to the solution point, both along the direction of the gradient of the functional.

Consider the functional in (34), where $\mathcal{F}$ is a real function defined for all $x$ from a chosen starting point $x_s$ to the solution point $x_S$ of (1), monotone decreasing with decreasing $\|Ax - b\|$, with $\mathcal{F}(0) = F(x_S)$. The gradient of $F(x)$ is

$$ \nabla F(x) = \frac{d\mathcal{F}}{d\|Ax - b\|}\,\frac{A^T(Ax - b)}{\|Ax - b\|} \qquad (36) $$

i.e., it is in the direction of the vector $A^T(Ax - b)$.

The minimum of $F(x)$ along a straight line through $x_s$ in the direction defined by an $n$-dimensional vector $d$ is found from the condition that $x - x_s = \lambda d$, where $\lambda$ is a scalar to be determined, and that $d$ and $A^T(Ax - b)$ are perpendicular. This gives $\lambda$ and $x$ for the minimum of $F(x)$, namely,

$$ x_{\min F}^{(d)} = x_s - \frac{(Ad)^T(Ax_s - b)}{\|Ad\|^2}\,d \qquad (37) $$

The point at which $F(x)$ is minimum along the normal taken at $x_s$ is determined by replacing $d$ with $A^T(Ax_s - b)$ in (37),

$$ x_{\min F} = x_s - \frac{\|A^T(Ax_s - b)\|^2}{\|AA^T(Ax_s - b)\|^2}\,A^T(Ax_s - b) \qquad (38) $$

The minimum distance between the solution point $x_S$ and a point $x$ along the direction of the gradient of $F(x)$ taken at $x_s$ is attained at

$$ x = x_s + \mu\,A^T(Ax_s - b) \qquad (39) $$

where the scalar $\mu$ is determined by requiring the distance $\|x - x_S\|$ to be minimum. This gives $\mu$ and $x$ for this minimum, namely,

$$ x_{\min D} = x_s - \frac{\|Ax_s - b\|^2}{\|A^T(Ax_s - b)\|^2}\,A^T(Ax_s - b) \qquad (40) $$

which depends only on the residual vector and on $A^T(Ax_s - b)$ at the point $x_s$, independently of the form of $\mathcal{F}$.

As already mentioned, repeated minimizations using only (38) are not efficient for solving a general system (1). The same is true when using only (40). To obtain an approximate solution of (1), we apply the formulae (40) and (38) alternately, the starting point $x_s$ being each time the point determined in the preceding minimization. As in Section 3, when there is no indication about a convenient first starting point $x_s$, one can use the origin $x_s = 0$. Only a few iterations, up to ten, are needed for a solution with a relative error $\|x - x_S\|/\|x_S\|$ of about 1% for systems with condition numbers of up to 100.
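The alternation of (40) and (38) reduces to a simple loop; the sketch below starts from the origin and runs a fixed number of minimization pairs, both of which are illustrative assumptions.

```python
import numpy as np

def alternate_minimizations(A, b, n_pairs=5):
    """Alternate application of the minimum-distance step (40) and the
    minimum-of-F step (38), each starting from the preceding point."""
    x = np.zeros(A.shape[1])                 # starting point x_s = 0
    for _ in range(n_pairs):
        r = A @ x - b
        g = A.T @ r
        x = x - ((r @ r) / (g @ g)) * g      # (40): closest point to x_S along g
        r = A @ x - b
        g = A.T @ r
        Ag = A @ g
        x = x - ((g @ g) / (Ag @ Ag)) * g    # (38): minimum of F along g
    return x
```

For instance, the Hilbert-matrix experiment described next corresponds to calling this routine with $H$ of order eight and $b = H(1\;1\;\cdots\;1)^T$.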

The procedure is surprisingly efficient for a rough solution, even for very ill-conditioned systems. For example, for the system $Hx = b$, where $H$ is the Hilbert matrix of order eight and whose solution is $x_S = (1\;1\;\cdots\;1)^T$ [10], a solution of 6% accuracy is obtained with a starting point $x_s = 0$ by performing only seven alternate minimizations (40), (38), with no equilibration or regularization applied beforehand to the system. By comparison [10], for the Gauss elimination method the accuracy is only 40.6%, for Gauss elimination with equilibration 9.15%, and for the Householder method with equilibration 5.6%.

Experimental results show that, as the new point $x$ given by (40), (38) becomes closer to the solution point $x_S$, the direction $A^T(Ax - b)$ of the gradient in (36), i.e., of the correction terms in (40) and (38), becomes more and more orthogonal to the direction of $x - x_S$. This causes the magnitude $\|Ax - b\|$ of the residual vector of (1) and the relative difference $\|x - x_S\|/\|x_S\|$ to no longer decrease significantly. To progress with the computation, one can intercalate minimizations of $F(x)$ along directions that are different from that of $A^T(Ax - b)$. Numerical testing shows that $x - x^0$, where $x^0$ is the original starting point, has at this stage, in most cases, a significant component along the direction of $x - x_S$. Thus, one can use such a minimization direction to try to improve the solution accuracy.


5. Conclusions

Highly convergent iteration formulae for solving general, large linear systems of algebraic equations are derived from exact analytic solutions of particular differential equations, based on new, accurate series representations of the matrix exponential recently published in ref. [3]. Specialized differential equations, which make it possible to monitor and control the computation errors and the decrease of the magnitude of the residual vector $\|Ax - b\|$ of (1) at each stage of the computation procedure, are constructed and integrated numerically to approximate the solution of these systems. Two solution methods have been presented in this book chapter, and the simple Euler method is applied for the numerical integration of the vector differential equations. In the first method, cycles of integration are used, each cycle starting from a convenient value of the unknown and continuing until the rate of decrease of $\|Ax - b\|$ becomes too small. The second method is an iterative method where a fixed rate of decrease of $\|Ax - b\|$ is imposed at the beginning of the iteration cycles. In this method, only the first step of the Euler method is computed at each iteration, and each cycle of iterations is conducted until there is no significant change in the magnitude of the residual vector. These two methods are highly efficient for large systems with condition numbers below 100, since only up to six cycles with fewer than ten steps per cycle are necessary to get a solution accuracy of 1%, with only two matrix-vector multiplications to compute at each step within a cycle. The method in Section 3.2 seems to be more efficient for systems with larger condition numbers. The number of cycles of integration/iteration increases with the condition number, and preconditioning should be applied to ill-conditioned systems before attempting the methods presented in this work.

The iterative method of alternate minimizations presented in Section 4 is intended for computing quickly a rough approximation of the solution of linear systems of equations. In this method, preequilibration or preregularization/preconditioning are not required to obtain useful results even for systems with poorly conditioned matrices.

The present book chapter is intended as a review of the work done so far on the subject. It describes the proposed new methods for the approximate solution of large linear algebraic systems using appropriately chosen matrix functionals and shows the procedures for constructing the concrete solution algorithms. At this stage, the results presented are only validated by preliminary numerical experiments, which indicate the efficiency of the proposed procedures for deriving approximate/rough solutions of large systems. More numerical experiments involving systems with higher condition numbers, as well as theoretical results, will be presented in future work. It is my hope that other researchers will be attracted to this new area and that rigorous theoretical results will also be established (theorems, etc.).

References

  1. Ciric IR. Series solutions of vector differential equations related to linear systems of algebraic equations. In: Proceedings of the 12th International Symposium on Antenna Technology and Applied Electromagnetics and Canadian Radio Sciences Conference (ANTEM/URSI). Montréal, Canada: IEEE; 2006. pp. 317-321. ISBN: 978-0-9638425-1-7
  2. Ciric IR. Rapidly convergent matrix series for solving linear systems of equations. In: Proceedings of the 17th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Montréal, Canada: IEEE; 2016. DOI: 10.1109/ANTEM.2016.755022. ISBN: 978-1-4673-8478-0
  3. Ciric IR. New matrix series formulae for matrix exponentials and for the solution of linear systems of algebraic equations. In: Shah K, Okutmustur B, editors. Functional Calculus. London: IntechOpen; 2020. DOI: 10.5772/intechopen.77599. ISBN: 978-1-83880-007-9
  4. Ciric IR. Approximate solution of large linear algebraic systems using differential equations. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1
  5. Burden RL, Faires JD. Numerical Analysis. 5th ed. Boston: PWS-KENT; 1993. ISBN: 0-534-93219-3
  6. Ciric IR. Using selected rates of residual vector decrease for solving iteratively large linear systems. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1
  7. Ciric IR. Alternate minimizations for an approximate solution of general systems of linear algebraic equations. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1
  8. Ciric IR. A geometric property of the functional associated with general systems of algebraic equations. In: Proceedings of the 12th International Symposium on Antenna Technology and Applied Electromagnetics and Canadian Radio Sciences Conference (ANTEM/URSI). Montréal, Canada: IEEE; 2006. pp. 311-315. ISBN: 978-0-9638425-1-7
  9. Lanczos C. Applied Analysis. New York: Dover Publications, Inc.; 1988. ISBN: 0-486-65656-X
  10. Hämmerlin G, Hoffmann K-H. Numerical Mathematics. New York: Springer-Verlag; 1991
