Open access peer-reviewed chapter - ONLINE FIRST

Using Matrix Differential Equations for Solving Systems of Linear Algebraic Equations

Written By

Ioan R. Ciric

Submitted: January 23rd, 2022Reviewed: March 3rd, 2022Published: April 17th, 2022

DOI: 10.5772/intechopen.104209

IntechOpen
Matrix Theory - Classics and AdvancesEdited by Mykhaylo Andriychuk

From the Edited Volume

Matrix Theory - Classics and Advances [Working Title]

Dr. Mykhaylo I. Andriychuk

Chapter metrics overview

12 Chapter Downloads

View Full Metrics

Abstract

Various ordinary differential equations of the first order have recently been used by the author for the solution of general, large linear systems of algebraic equations. Exact solutions were derived in terms of a new kind of infinite series of matrices which are truncated and applied repeatedly to approximate the solution. In these matrix series, each new term is obtained from the preceding one by multiplication with a matrix which becomes better and better conditioned tending to the identity matrix. Obviously, this helps the numerical computations. For a more efficient computation of approximate solutions of the algebraic systems, we consider new differential equations which are solved by simple techniques of numerical integration. The solution procedure allows to easily control and monitor the magnitude of the residual vector at each step of integration. A related iterative method is also proposed. The solution methods are flexible, permitting various intervening parameters to be changed whenever necessary in order to increase their efficiency. Efficient computation of a rough approximation of the solution, applicable even to poorly conditioned systems, is also performed based on the alternate application of two different types of minimization of associated functionals. A smaller amount of computation is needed to obtain an approximate solution of large linear systems as compared to existing methods.

Keywords

  • matrix equations
  • large linear algebraic systems
  • solution by numerical integration

1. Introduction

Exact analytic expressions in the form of infinite series of matrices for the solution of linear systems of algebraic equations were derived in [1] by integrating associated ordinary differential equations. These differential equations were obtained using a quadratic function related to the system of algebraic equations and describe the orthogonal trajectories of the family of hypersurfaces representing the functional. More convergent matrix series were presented in [2] which can be applied to approximate the solution of the system of equations. Solution of linear systems based on the numerical integration of differential equations has originally been formulated in [3].

In Section 2 of the present book chapter, we use recently derived highly convergent series formulae for matrix exponentials [3] in order to construct improved iterative methods for solving approximately large systems of algebraic equations. In Section 3, we use novel functionals that allow to formulate differential equations which lead to a substantial increase in the efficiency of the solution process [4]. Independently of the starting point, the paths of integration of these equations converge all to the solution point of the system considered. At each step of the numerical solution, one passes, in fact, from one path to a slightly different one due to computation errors. The procedure does not require to find accurately an entire path but only the solution point which is common to all the paths. This is why we apply the simple Euler method [5] to integrate the differential equations. The computation errors are now determined by the magnitude of the second derivative of the position vector with respect to the parameter defining the location along the path. A related iterative method [6] is also described. In Section 4, two different kinds of minimization of the system functionals are applied alternately for quick computation of a rough solution of large linear systems [7].

Advertisement

2. Matrix series formulae for the solution of linear algebraic systems

Consider a system of equations written in matrix form as

Axb=0E1

where ARn×nis an arbitrary nonsingular matrix, bRnis a given n-dimensional vector and x=x1x2xnTis the unknown n-dimensional vector, with Tindicating the transpose. Assume that xis a continuous function of the real variable vover a certain interval and associate to (1) the vector differential equation of the first order

dxdv=fvAxbE2

where fvis a continuous function to be chosen.

2.1 Exact analytic expressions for the solution of (2)

Imposing the condition xvo=xo,xobeing a chosen position vector, (2) has a unique solution over a specified interval [4], namely,

xv=xo+eAgvok=0Akk+1!gvk+1gvok+1AxobE3

where gvfvdvis a primitive of fv,i.e., fv=dg/dv.If fvis taken to be fv=1/v,then gv=lnv.Choosingvo=1gives gvo=0and (3) becomes now

xv=xo+lnvk=0Alnvkk+1!AxobE4

Over the intervalv02,xvcan also be expressed in the form [3]

xv=xo1vI+k=11vkk+1IAIA2IAkAxobE5

Idenoting the identity matrix of order n. The solution of (1) is theoretically obtained for v=0, the series being in general extremely low convergent. Since the rate of convergence of the series in (5) is very small for vvery close to the value corresponding to the solution of (1), i.e., v=0, this formula should be applied repeatedly for a vnot too close to zero, v(0,1), until a satisfactory small distance to the solution x0is reached.

2.2 Highly convergent series iteration formulae

Practical formulae for an approximate solution of (1) can be derived by using matrix series that are much more convergent than the one in (5). This can be done by writing (4) in the form

xv=xo+A1eAlnvIAxobE6

and by expressing the matrix exponential in terms of series of superior convergence given recently in ref. [3]. Very close to the solution v=0, say for v=eN,N1, we have eAlnv=eNAand with [3]

eNAI=10qcqNAI+k=11k10qkk+1I+cqNA×I+cqNA2I+cqNAk]E7

where q> 0 and cq1/ln1+10q, we get

xeN=xo10qcqNI+k=11k10qkk+1I+cqNA×I+cqNA2I+cqNAk]AxobE8

In order to perform numerical computations, the number of terms in the series has to be appropriately chosen. To have a small cqin (8) one has to take a small q. For q= 0.5, e.g., cq=3.639409. One may start with N=15 which would require to retain about 50 terms in this alternating series to get a rough approximation of xe15. The computation is repeated with the new xotaken to be the preceding xeNuntil an acceptable accuracy of the solution of (1) is reached.

Much more convergent formulae are constructed if we apply in (6) the expansion [3]

eNA=1+10qpI+p!k=1p10qkk!pk!IcqNApIcqNAp1IcqNApk+110p+1qcqNAIcqNAIcqNA2IcqNAp1p+1I+p!k=11k10qkk+1k+2k+p+1I+cqNAI+cqNA2I+cqNAk,E9
p=1,2,

The multiplication with A1from (6) is avoided by arranging the first summation in (9) in terms of powers of Aand by taking into account that

p!k=1p10qkk!pk!=1+10qp1E10

It is obvious that using (9) leads to computations with a much smaller number of terms retained in the series for a given N. With the same amount of computation we obtain an xeNwhich is much closer to the exact solution x0of the system of equations (1).

Advertisement

3. Methods of numerical integration

In this Section, we use special kinds of functionals that lead to the construction of differential equations which allow a substantial increase in the efficiency of the solution of large linear algebraic systems.

3.1 Vector differential equations and their application to the solution of (1)

Consider a functional of the form

Fx=AxbαE11

associated to (1) where αis a positive real number to be chosen. A real variable v,v>0,vvovS, is now defined by

Fxv=FxvohvE12

where hvis an appropriately selected real function with definite first and second derivatives in vovS,vocorresponding to a starting point xvox0,with hvo=1, and vScorresponding to the solution vector of 1xvSxS,with hvS=0. Denoting Fx0Fo, we have

Fodhdv=αAxbα2ATAxbTdxdvE13

Thus,

dxdv=FoαdhdvATAxbAxbα2ATAxb2E14

which is the differential equation to be integrated from v=voto v=vS. The second derivative of xis

d2xdv2=Foαd2hdv2ATAxbAxbα2ATAxb2+Foα2dhdv21Axb2α1ATAxb2×Axb2ATAxb2ATA2AATAxb2ATAxb2I+2α}IATAxbE15

From (11) and (12) we get a useful relationship,

Axb=Fohv1/αE16

that allows to simply monitor the magnitude of the residual vector of (1) during the computation process.

As explained in the Introduction we apply the Euler method for the solution of (14) and compute successively

xi+1=xi+ηdxdvx=xi,i=0,1,2,E17

where ηis the step size. In the absence of any hint about a good starting point x0corresponding to v=vo, we have used the point along the normal from the origin x=0to the surface Fx=constin (11) which is the closest to the solution point xS[8], i.e.,

x0=b2ATb2ATbE18

The function hvand the parameter αare chosen such that the first and the second derivatives of xin (14) and (15) remain finite when vvS, i.e., when the residual Axb0, while the errors evaluated with the second derivative are kept reasonably small along with the interval of integration vvovS. For each hv,αis determined by imposing the condition that the second derivative in (15) tends to zero as Axb0. To decide on the value of αfor a given hv, we require to have a good rate of decrease of Axbat the beginning of the computation process, for instance to have (see (11) and (12))

Ax1bAx0b=hη1/α0.8E19

when η=0.1. Finally, to solve (1), we determine the actual step size to be used for the numerical solution such that the errors at the beginning of the computation process are small, say

η221x1d2xdv2x=x0<0.01E20

If the magnitude of the residual vector, which is computed at each step, does not decrease anymore the computation is continued with a new cycle of integration by applying (17) with x0replaced by a new starting point, i.e.,

xnew=xlastAxlastb2ATAxlastb2×ATAxlastbE21

where xlastis the position vector from the preceding step. xnewis the closest point to xSalong the normal to the surface Fx=consttaken at xlast[8]. Subsequent cycles of integration are performed in the same way until a satisfactory approximate solution of (1) is obtained. If the difference between xnewand xlastis insignificant we find the point along the normal to Fx=const, taken at xnew, where Fhas a minimum and, then, apply (21) again. It should be noted that as one approaches the solution point the direction of the gradient ATAxbtends to become more and more perpendicular to the direction of xSxand, thus, the residual Axband the relative error xxS/xSwill not decrease any more as expected. This is why the computation has to be continued by opening a new cycle of integration. For systems with higher condition numbers, this happens more quickly.

Numerical experiments obtained using hv=1vwith α=0.45and also using hv=1v2with α=0.9shows that only two up to five integration cycles with a step size of 0.1 are needed in order to get an accuracy of about 1% for the solution of systems with condition numbers up to 100.

3.2 A related iterative method

The basic idea of this method is to find, starting from a point x0, a point x1along the gradient of a functional (11) associated with the general system (1), such that the magnitude of the new residual vector is an established fraction of its initial value, Ax1b=τAx0b,τ<1. Instead of performing an integration as in Section 3.1, one proceeds iteratively, i.e.,

Axi+1b=τAxib,i=0,1,2,E22

for each iteration the starting point being the point found in the preceding iteration, with τ maintained as tight as possible at the same value. To do this, we impose that Fxin (11) varies from xito xi+1in the same way for each iteration, namely,

Fx=FxihvE23

where hvis now a real function of a real variable, monotone decreasing in the interval vvovo+η,vo0,η>0,with definite first and second derivatives, and with hvo=1such that

Fxi+1=Fxihvo+η,i=0,1,2,E24

Then, from (11),

Fxi+1Fxi=Axi+1bαAxibα=ταE25

and

τ=hvo+η1/αE26

xi+1is computed as

xi+1=xi+ηdxdvv=voE27

i.e., with (14) where Fois replaced with Fxi,

xi+1=xi+ηαdhdvv=voAxib2ATAxib2×ATAxib,i=0,1,2,E28

in which, taking into account (21), one has to have η/αdh/dvv=vo<1. This expression corresponds to the first step in the numerical integration by Euler’s method of the differential Eq. (14), starting from xiwith a step η, or to the first two terms of the Taylor series expansion of xvo+η.

The starting value x0and the function hvare chosen in the same way as in Section 3.1. For selected ratios η/αthe iteration cycle continues as long as the residual Axibdecreases at a proper rate. Theoretically, to make Axib=εAx0bwith ε1one would need to conduct lnε/lnτiterations. Since computation errors are introduced at each iteration (as in the Euler method), the initially chosen value of τ cannot be maintained the same as the iterative process continues. An approximate solution of (1) is obtained at the end of the iteration cycle as before, applying (21). Subsequent iteration cycles are performed with the starting point in each cycle being the point determined in the preceding cycle.

Numerical results were generated using, as in the method presented in Section 3.1, hv=1vbut with η/α=0.2and hv=1v2with η/α=0.1. Now, in both cases, vo=0,η/αdh/dvv=0=0.2in (28), and τ0.8.A substantial increase in τ, approaching τ=1, or oscillations of its value during the first few iterations show that the computation errors evaluated from (15) (with Foreplaced by Axbα) are too big. In such a situation, one has to decrease the factor η/αdh/dvv=0in (28), i.e., to decrease η/αfor a given hv, which leads to an increase of τ and a decrease of 1/2η2d2x/dv2v=vo. For accuracy of 1% for the solution of (1), one needs a number of three to six short iteration cycles, with about eight iterations per cycle, for systems with condition numbers below 100. Of course, as in the previous method, an increased number of iteration cycles is required to reach the same accuracy for the solution of poorly conditioned systems.

Remarks.One can easily see that xi+1in (28) can be expressed in the form

xi+1=x0+=0ixd,i=0,1,2,E29

where

xdx+1x=ηαdhdvv=voAxb2ATAxb2×ATAxbE30

with the solution of (1) given by the infinite series

xS=x0+xd0+xd1+xd2+E31

xdin (30) can also be written as

xd=ηαdhdvv=vob2ATb2ATbE32

where

b=Axd1b1,=1,2,;b0=Ax0bE33

This shows that the difference xdis a “rough approximation” to the solution of a system (1) whose right-hand side is, at each iteration, just the residual vector of the system in the preceding iteration, which decreases in magnitude from one iteration to the next. Thus, the method presented in this Section represents a practical, concrete implementation of the well-known idea of successive approximations/perturbations [9].

To search for a possible increase in efficiency, more general functionals of the form

Fx=FAxbE34

could be tested, where Fand its first derivative are finite and continuous at all the points within the interval of integration. Now the corresponding (14) is

dxdv=FodhdvdFdAxb1AxbATAxb2×ATAxbE35

Note.The paths of integration in the methods presented can be looked at as being the field lines of a Poissonian electrostatic field in a homogeneous region bounded by a surface of constant potential Fx=Axobαand with a zero potential at the solution point, FxS=0. In the particular case of α=2the ratio of the volume density of charge within the region to the material permittivity is constant, namely, 2i=1nk=1naik2where aikare the entries of A.By altering this electrostatic field one could eventually make quicker the approach to the solution point along the integration path.

Advertisement

4. Method of alternate minimizations

This simple method is based on the property of a functional of the form (34) associated with a general system of equations (1) to allow not only to minimize the value of the functional but also the distance to the solution point of (1). Using only minimizations along ordinary gradients of the functional is not efficient unless the system is very well-conditioned.

For computing efficiently an approximate solution of general, large linear systems of algebraic equations, we propose in this Section the alternate application of minimizations of a functional and of the distance to the solution point, along the direction of the gradient of the functional.

Consider the functional in (34) where Fis a real function defined for all xfrom a chosen starting point xsto the solution point xSof (1), monotone decreasing with Axb,F0=FxS. The gradient of Fxis

Fx=dFAxbdAxbATAxbAxbE36

i.e., in the direction of the vector ATAxb.

The minimum of Fxalong a straight line through xsin the direction defined by an n-dimensional vector dis found from the condition that xxs=λd, where λis a scalar to be determined, and ATAxbare perpendicular. This gives λand xfor the minimum of Fx, namely,

xminFd=xsAdTAxsbAd2dE37

The point at which Fxis minimum along the normal taken at xsis determined by replacing dwith ATAxsbin (37),

xminF=xsATAxsb2AATAxsb2ATAxsbE38

The minimum distance between the solution point xSand a point xalong the direction of the gradient of Fxtaken at xsis at

x=xs+μATAxbE39

where the scalar μis determined by requiring the distance xxSto be minimum. This gives μand xfor this minimum, namely,

xminD=xsAxsb2ATAxsb2ATAxsbE40

which depends only on the residual vector and on ATAxbat the point xs, independently of the form of F.

As already mentioned, repeated minimizations using only (38) are not efficient for solving a general system (1). The same is true when using only (40). To obtain an approximate solution of (1), we apply the formulae (40) and (38) alternately, the starting point xsbeing each time the point determined in the preceding minimization. As in Section 3, when there is no indication about a convenient first starting point xs, one can use the origin xs=0.Only a few iterations, up to ten, are needed for a solution with a relative error xxS/xSof about 1% for systems with condition numbers of up to 100.

The procedure is surprisingly efficient for a rough solution even for very ill-conditioned systems. For example, for the system Hx=bwhere His the Hilbert matrix of order eight and whose solution is xS=111T[10], a solution of 6% accuracy is obtained with a starting point xs=0by performing only seven alternate minimizations (40), (38), with no equilibration or regularization preoperated on the system. By comparison [10], for the Gauss elimination method, the accuracy is only 40.6%, for the Gauss elimination with equilibration 9.15%, and for the Householder method with equilibration 5.6%.

Experimental results show that, as the new point xgiven by (40), (38) becomes closer to the solution point xS, the direction ATAxbof the gradient in (36), i.e., of the correction terms in (40) and (38), becomes more and more orthogonal to the direction of xxS. This causes the magnitude Axbof the residual vector of (1) and the relative difference xxS/xSto not significantly decrease anymore. To progress with the computation one can intercalate minimizations of Fxalong directions that are different from that of ATAxb. By numerical testing, it is observed that xx0, where x0is the original starting point, has at this stage, in most cases, a significant component along the direction of xxS. Thus, one can use such a minimization direction to try to improve the solution accuracy.

Advertisement

5. Conclusions

Highly convergent iteration formulae for solving general, large linear systems of algebraic equations are derived from exact analytic solutions of particular differential equations based on new, accurate series representations of the matrix exponential recently published in ref. [3]. Specialized differential equations which make it possible to monitor and control the computation errors and the decrease of the magnitude of the residual vector Axbof (1) at each stage of the computation procedure are constructed and integrated numerically to approximate the solution of these systems. Two methods of the solution have been presented in this book chapter and the simplest Euler method is applied for the numerical integration of the vector differential equations. In the first method cycles of integration are used, each cycle starting from a convenient value of the unknown and continuing until the rate of decrease of Axbbecomes too small. The second method is an iterative method where a fixed rate of decrease of Axbis imposed at the beginning of the iteration cycles. In this method, only the first step in the Euler method is computed at each iteration and each cycle of iteration is conducted until there is no significant change in the magnitude of the residual vector. These two methods are highly efficient for large systems with condition numbers below 100 since only up to six cycles with less than ten steps per cycle are necessary to get a solution accuracy of 1%, at each step within a cycle having to compute two matrix-vector multiplications. The method in Section 3.2 seems to be more efficient for systems with bigger condition numbers. The number of cycles of integration/iteration increases with the condition number and preconditioning should be done for ill-conditioned systems before attempting to apply the methods presented in this work.

The iterative method of alternate minimizations presented in Section 4 is intended for computing quickly a rough approximation of the solution of linear systems of equations. In this method, preequilibration or preregularization/preconditioning are not required to obtain useful results even for systems with poorly conditioned matrices.

The present Book Chapter has been intended to constitute a review of the work done so far on the subject matter. It describes the proposed new methods for an approximate solution of large linear algebraic systems using appropriately chosen matrix functionals and shows the procedures for constructing the concrete solution algorithms. At this stage, the results presented are only validated by preliminary numerical experiments which indicate the efficiency of the proposed procedures for deriving approximate/rough solutions of large systems. More numerical experiments involving systems with higher condition numbers, as well as theoretical results, will be presented in future work. It is my hope that other researchers will be attracted to this new area and rigorous theoretical results will also be established (theorems, etc.).

References

  1. 1.Ciric IR. Series solutions of vector differential equations related to linear systems of algebraic equations. In: Proceedings of the 12th International Symposium on Antenna Technology and Applied Electromagnetics and Canadian Radio Sciences Conference (ANTEM/URSI). Montréal, Canada: IEEE; 2006. pp. 317-321. ISBN: 978-0-9638425-1-7
  2. 2.Ciric IR. Rapidly convergent matrix series for solving linear systems of equations. In: Proceedings of the 17th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Montréal, Canada: IEEE; 2016. DOI: 10.1109/ANTEM.2016.755022/978-1-4673-8478-0
  3. 3.Ciric IR. New matrix series formulae for matrix exponentials and for the solution of linear systems of algebraic equations. In: Shah K, Okutmustur B, editors. Functional Calculus. London: IntechOpen; 2020. DOI: 10.5772/Intechopen.2022.77599/978-1-83880-007-9
  4. 4.Ciric IR. Approximate solution of large linear algebraic systems using differential equations. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1
  5. 5.Burden RL, Faires JD. Numerical Analysis. 5th ed. Boston: PWS-KENT; 1993. ISBN: 0-534-93219-3
  6. 6.Ciric IR. Using selected rates of residual vector decrease for solving iteratively large linear systems. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1
  7. 7.Ciric IR. Alternate minimizations for an approximate solution of general systems of linear algebraic equations. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1
  8. 8.Ciric IR. A geometric property of the functional associated with general systems of algebraic equations. In: Proceedings of the 12th International Symposium on Antenna Technology and Applied Electromagnetics and Canadian Radio Sciences Conference (ANTEM/URSI). Montréal, Canada: IEEE; 2006. pp. 311-315. ISBN: 978-0-9638425-1-7
  9. 9.Lanczos C. Applied Analysis. New York: Dover Publications, Inc.; 1988. ISBN: 0-486-65656-X (Paperback)
  10. 10.Hämmerlin G, Hoffmann K-H. Numerical Mathematics. New York: Springer-Verlag; 1991

Written By

Ioan R. Ciric

Submitted: January 23rd, 2022Reviewed: March 3rd, 2022Published: April 17th, 2022