Open access peer-reviewed chapter

A Shamanskii-Like Accelerated Scheme for Nonlinear Systems of Equations

Written By

Ibrahim Mohammed Sulaiman, Mustafa Mamat and Umar Audu Omesa

Submitted: 20 March 2019 Reviewed: 04 June 2019 Published: 13 May 2020

DOI: 10.5772/intechopen.87246

From the Edited Volume

Nonlinear Systems - Theoretical Aspects and Recent Applications

Edited by Walter Legnani and Terry E. Moschandreou


Abstract

Newton-type methods with a diagonal update to the Jacobian matrix are regarded as among the most efficient and low-memory schemes for systems of nonlinear equations. One of the main advantages of these methods is their ability to solve nonlinear systems of equations whose Fréchet derivative is singular at the root. In this chapter, we present a Jacobian approximation to the Shamanskii method in order to obtain a convergent and accelerated scheme for systems of nonlinear equations. Precisely, we focus on the efficiency of the proposed method and compare its performance with that of existing methods. Numerical examples illustrate the efficiency and the theoretical analysis of the proposed method.

Keywords

  • Newton method
  • Shamanskii method
  • diagonal updating scheme
  • nonlinear equations
  • Jacobian matrix

1. Introduction

A large class of scientific and management problems is formulated as finding the values of x at which a given function evaluates to zero [1]. This description can be represented mathematically by the following system of nonlinear equations:

$$f_1(x_1, x_2, \ldots, x_n) = 0$$
$$f_2(x_1, x_2, \ldots, x_n) = 0 \qquad (1)$$
$$\vdots$$
$$f_n(x_1, x_2, \ldots, x_n) = 0$$

where $x = (x_1, x_2, \ldots, x_n)^T$ is a vector in $\mathbb{R}^n$ and each $f_i$ is a nonlinear function for $i = 1, 2, \ldots, n$. The above system of equations (1) can be written as

$$F(x) = 0 \qquad (2)$$

where $F : \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable in an open neighborhood of the solution $x^*$. Such systems arise as natural descriptions of observed phenomena in numerous real-life problems, and obtaining their solutions is an important goal of mathematical study. Recently, this area has been studied extensively [2, 3]. The most powerful techniques for handling nonlinear systems of equations linearize the equations and then iterate on the linearized set of equations until an accurate solution is obtained [4]. This requires the derivative or gradient of the equations. Various scholars stress that the derivatives should be obtained analytically rather than numerically. However, this is not always convenient and, in many cases, not even possible, as the equations may be generated by a computer algorithm [2]. For a one-variable problem, the system of nonlinear equations defined in (2) reduces to a single function $F : \mathbb{R} \to \mathbb{R}$, where $f$ is continuous on an interval $[a, b]$.
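When analytic derivatives are unavailable, the Jacobian can be approximated numerically by finite differences. The sketch below is our own illustration in Python/NumPy (the chapter's experiments later use MATLAB); the function name approx_jacobian and the step size h are assumptions made for the example, not part of the original text.

```python
import numpy as np

def approx_jacobian(F, x, h=1e-7):
    """Forward-difference approximation of the Jacobian of F at x."""
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(F(x))
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        e = np.zeros(x.size)
        e[j] = h
        J[:, j] = (np.asarray(F(x + e)) - f0) / h   # column j holds dF/dx_j
    return J

# Example: F(x) = (x1^2 + x2 - 3, x1*x2 - 1); exact Jacobian at (1, 2) is [[2, 1], [2, 1]]
F = lambda x: np.array([x[0]**2 + x[1] - 3.0, x[0]*x[1] - 1.0])
print(approx_jacobian(F, [1.0, 2.0]))
```

Such a finite-difference approximation is convenient when analytic derivatives are unavailable, but it costs an extra function evaluation per variable, which is part of the motivation for the cheaper Jacobian approximations discussed below.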

Definition 1: Consider a system of equations $f_1, f_2, \ldots, f_n$; a solution of this system (in one, two, or $n$ variables) is a point $(a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$ such that $f_1(a_1, a_2, \ldots, a_n) = f_2(a_1, a_2, \ldots, a_n) = \cdots = f_n(a_1, a_2, \ldots, a_n) = 0$.

In general, the problem to be considered is that, for some function $f(x)$, one wishes to evaluate the derivative at some point $x$, i.e.,

$$\text{Given } f(x), \text{ evaluate } \mathrm{deriv} = \frac{df}{dx}$$

This is often used to represent the instantaneous change of the function at a given point [5].

Definition 2: For a smooth function $f(x)$, there exists at any point $x$ a vector of first-order partial derivatives, or gradient vector:

$$\nabla f(x) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^T = g(x).$$

The Taylor series expansion of the function $f(x)$ about a point $x_0$ is an ideal starting point for this discussion [1].

Definition 3: Let $f$ be an infinitely differentiable function; the Taylor series of $f(x)$ around a point $a$ is the infinite sum:

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$$
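For instance (an illustrative example of ours, not from the original text), expanding $f(x) = e^x$ about $a = 0$ gives

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,$$

and truncating after the linear term gives the local linearization $f(x) \approx f(a) + f'(a)(x - a)$, which is exactly the kind of approximation underlying Newton-type methods.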

However, a continuously differentiable vector-valued function does not, in general, satisfy the mean value theorem (MVT), an essential tool in calculus [6]. Hence, the following theorem is commonly used in place of the mean value theorem.

Theorem 1: Let $F : \mathbb{R}^n \to \mathbb{R}^m$ be continuously differentiable in an open convex set $D \subset \mathbb{R}^n$. For any $x, x + s \in D$,

$$F(x + s) - F(x) = \int_0^1 J(x + ts)\, s \, dt = \int_x^{x+s} F'(z)\, dz$$

Definition 4: Suppose $F : \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable at the point $x \in \mathbb{R}^n$ and each component function $f_1, f_2, \ldots, f_n$ is also continuously differentiable at $x$; then the derivative of $F$ at $x$ is defined as

$$J_x F = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_2}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_1} \\ \dfrac{\partial f_1}{\partial x_2} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_n}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f_1}{\partial x_n} & \dfrac{\partial f_2}{\partial x_n} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{pmatrix}$$

Most of the algorithms employed for obtaining the solution of Eq. (1) center on approximating the Jacobian matrix, which provides a linear map $T_x : \mathbb{R}^n \to \mathbb{R}^n$ defined by Eq. (3):

$$T_x = J_x F(x), \quad x \in \mathbb{R}^n \qquad (3)$$

Also, if $F$ is differentiable at the point $x^*$, then the affine function $A(x) = F(x^*) + J_{x^*}(x - x^*)$ is a good approximation to $F(x)$ near $x = x^*$ in the sense that

$$\lim_{x \to x^*} \frac{\left\| F(x) - F(x^*) - J_{x^*}(x - x^*) \right\|}{\left\| x - x^* \right\|} = 0 \qquad (4)$$

where $\| x - x^* \| = \sqrt{(x_1 - x_1^*)^2 + (x_2 - x_2^*)^2 + \cdots + (x_n - x_n^*)^2}$.

If all the partial derivatives of the component functions $f_1, f_2, \ldots, f_n$ appearing in $J_x F$ are continuous, then we say the function $F$ is differentiable.
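As a small worked example (ours, not from the chapter), take $n = 2$ with $f_1(x) = x_1^2 + x_2 - 3$ and $f_2(x) = x_1 x_2 - 1$. With the layout of Definition 4,

$$J_x F = \begin{pmatrix} 2x_1 & x_2 \\ 1 & x_1 \end{pmatrix}, \qquad J_{(1,2)} F = \begin{pmatrix} 2 & 2 \\ 1 & 1 \end{pmatrix},$$

which is singular at the point $(1, 2)^T$; a singular derivative of this kind is precisely the difficulty that motivates the diagonal updating scheme developed later in the chapter.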

The most famous method for solving nonlinear systems of equations $F(x) = 0$ is the Newton method, which generates a sequence $\{x_k\}$ from any given initial point $x_0$ via the following:

$$x_{k+1} = x_k - F'(x_k)^{-1} F(x_k) \qquad (5)$$

where $F'(x_k)$ is the Jacobian of $F$ at $x_k$. The sequence in Eq. (5) converges quadratically to the solution $x^*$ provided $x_0$ is sufficiently near the solution point and the Jacobian $F'(x^*)$ is nonsingular [7, 8]. This convergence rate makes the method outstanding among other numerical methods. However, evaluating the Jacobian and solving the linear system for the step $s_n = -F'(x_n)^{-1} F(x_n)$ are expensive and time-consuming [9]. This led to the study of different variants of the Newton method for systems of nonlinear equations. One of the simplest and lowest-cost variants, which almost entirely avoids derivative evaluation at every iteration, is the chord method. This scheme computes the Jacobian matrix $F'(x_0)$ only once throughout the iteration process for a finite-dimensional problem, as presented in Eq. (6),

$$x_{k+1} = x_k - F'(x_0)^{-1} F(x_k) \qquad (6)$$

The rate of convergence is linear and improves as the initial point gets better. Suppose $x_0$ is chosen sufficiently near the solution point $x^*$ and $F'(x^*)$ is nonsingular; then, for some $K_c > 0$, we have

$$\| x_{n+1} - x^* \| \le K_c \, \| x_0 - x^* \| \, \| x_n - x^* \| \qquad (7)$$

The convergence theorems and the proof of Eq. (7) can be found in [9, 10]. Motivated by the excellent convergence of the Newton method and the low Jacobian-evaluation cost of the chord method, a method due originally to Shamanskii [11, 12], which lies between the Newton method and the chord method, was proposed and has been analyzed by Kelley and others [9, 13, 14, 15]. Other variants of the Newton method with different Jacobian approximation schemes include [9, 14, 16, 17, 18]. However, most of these methods require the computation and storage of the full or approximate Jacobian, which becomes very difficult and time-consuming as the dimension of the system increases [10, 19].
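To make the contrast concrete, the following is a minimal Python/NumPy sketch of the Newton iteration of Eq. (5) and the chord iteration of Eq. (6); it is our own illustration (the chapter's experiments use MATLAB), and jac is assumed to return the analytic Jacobian $F'(x)$.

```python
import numpy as np

def newton(F, jac, x0, tol=1e-6, max_iter=500):
    """Newton iteration, Eq. (5): a fresh Jacobian solve at every step."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        fx = F(x)
        if np.linalg.norm(fx) <= tol:
            return x, k
        x = x + np.linalg.solve(jac(x), -fx)     # step s_k = -F'(x_k)^{-1} F(x_k)
    return x, max_iter

def chord(F, jac, x0, tol=1e-6, max_iter=500):
    """Chord iteration, Eq. (6): the Jacobian is evaluated only at x_0."""
    x = np.asarray(x0, dtype=float)
    J0 = jac(x)                                  # in practice one would also factor J0 once
    for k in range(max_iter):
        fx = F(x)
        if np.linalg.norm(fx) <= tol:
            return x, k
        x = x + np.linalg.solve(J0, -fx)
    return x, max_iter
```

Both routines return the approximate root together with the number of iterations used, so the trade-off between per-iteration cost and iteration count discussed above can be observed directly.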

It would therefore be worthwhile to construct a derivative-free approach and analyze it alongside existing techniques [20, 21, 22]. The aim of this work is to derive a diagonal matrix approximation to the Jacobian of the Shamanskii method by means of variational techniques. The expectation is to reduce the computational cost, storage, and CPU time of solving a given problem. The proposed method works efficiently by combining the good convergence rate of the Shamanskii method with the derivative-free approach employed, and the results are very encouraging. The next section presents the Shamanskii method for nonlinear systems of equations.


2. Shamanskii method

It is known that the Newton method defined in Eq. (5) converges quadratically to $x^*$ when the initial guess is sufficiently close to the root [7, 10, 19]. The major concern about this method is the evaluation and storage of the Jacobian matrix at every iteration [23]. A scheme that almost completely overcomes this is the chord method. This method factors the Jacobian matrix only once in the case of a finite-dimensional problem, thereby reducing the evaluation cost of each iteration as in Eq. (6), at the price of degrading the convergence rate to linear [10].

Motivated by this, a method due originally to Shamanskii [11] was developed and analyzed in [7, 13, 14, 16, 24]. Starting with an initial approximation $x_0$, this method uses the multiple pseudo-Newton approach described below:

$$x_{k+1/2} = x_k - F'(x_k)^{-1} F(x_k) \qquad (8)$$
$$x_{k+1} = x_{k+1/2} - F'(x_k)^{-1} F(x_{k+1/2}) \qquad (9)$$

After a little simplification, we have

$$x_{k+1} = x_k - F'(x_k)^{-1}\left[ F(x_k) + F\!\left( x_k - F'(x_k)^{-1} F(x_k) \right) \right] \qquad (10)$$

This method converges superlinearly with q-order of at least $t + 1$, where $t$ denotes the number of pseudo-Newton (chord) steps taken with each Jacobian factorization, provided the initial approximation $x_0$ is chosen sufficiently near the solution point $x^*$ and $F'(x^*)$ is nonsingular. This implies that there exists $K_s > 0$ such that

$$\| x_{n+1} - x^* \| \le K_s \, \| x_n - x^* \|^{t+1} \qquad (11)$$

Combining Eq. (7) and the quadratic convergence of the Newton method produces the convergence rate of the Shamanskii method given in Eq. (11). Thus, the balance is between the reduced cost of Fréchet derivative evaluation and Jacobian factorization in the Shamanskii method and the rapid convergence of the Newton method. This low-cost derivative evaluation, as well as the rapid convergence rate of several methods including the Shamanskii method, has been studied and analyzed in [13, 15]. From the analysis, the researchers conclude that the Shamanskii method shows superior performance compared to the Newton method in terms of efficiency whenever work is measured by function evaluations [9]. Also, if the value of $t$ is suitably chosen, then, as the dimension increases, the performance of the Shamanskii method improves, since the cost of factoring the Jacobian is spread over several pseudo-Newton iterations [14]. Please refer to [15] for the proof of the convergence theorem described below.
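A compact sketch of the Shamanskii cycle of Eqs. (8)-(10) is given below; it is our own Python illustration (not the authors' MATLAB code), using scipy.linalg's LU routines so that a single factorization of $F'(x_k)$ is reused for the $t$ pseudo-Newton steps of each cycle (t = 2 reproduces Eqs. (8)-(9)).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def shamanskii(F, jac, x0, t=2, tol=1e-6, max_iter=500):
    """Shamanskii iteration: factor F'(x_k) once, reuse it for t chord steps."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        fx = F(x)
        if np.linalg.norm(fx) <= tol:
            return x, k
        lu_piv = lu_factor(jac(x))          # one factorization per outer cycle
        for _ in range(t):                  # t pseudo-Newton (chord) steps
            x = x + lu_solve(lu_piv, -F(x))
            if np.linalg.norm(F(x)) <= tol:
                break
    return x, max_iter
```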

Theorem 2 [15]: Let $F : D \subset \mathbb{R}^n \to \mathbb{R}^n$ satisfy hypotheses H1, H2, and H3. Then the solution point $x^*$ is a point of attraction of the Shamanskii iteration, i.e., Eq. (10), and the method possesses at least a cubic order of convergence.


3. Diagonal updating scheme for solving nonlinear systems

Even though the computational cost has generally been reduced, as in the Shamanskii method [14, 25, 26, 27, 28], evaluating or inverting the full Jacobian matrix at every iteration, or even after every few iterations, remains undesirable. In fact, by adding a diagonal updating scheme to a method, we obtain a new low-memory iterative approach that approximates the Jacobian $F'(x_k)$ by a nonsingular diagonal matrix which can be updated at every iteration [29, 30, 31]. Indeed, using the Shamanskii procedure, the proposed method avoids the main complexity of Newton-type methods by reusing the evaluated Jacobian approximation during the iteration process. This is the basic idea of the Shamanskii-like method, which is described as follows.

Given an initial approximation $x_0$, we consider Eq. (2), and instead of computing the Jacobian $F'(x_k)$ we introduce a diagonal approximation to it, say $D_k$, as follows:

$$F'(x_{k+1}) \approx D_{k+1} \qquad (12)$$

Suppose $s_k = x_{k+1} - x_k$ and $y_k = F(x_{k+1}) - F(x_k)$; by the mean value theorem (MVT), we have

$$D_{k+1} s_k \approx y_k \qquad (13)$$

Substituting Eq. (12) in Eq. (13), we have

$$F'(x_k) s_k \approx y_k \qquad (14)$$

Since $D_{k+1}$ is the update of the diagonal matrix $D_k$, let us assume that $D_{k+1}$ satisfies the weak secant equation:

$$s_k^T D_{k+1} s_k = s_k^T y_k \qquad (15)$$

which is used together with the requirement that the deviation between $D_{k+1}$ and $D_k$ be minimized under a suitable norm. The update formula for $D_k$ follows from the theorem below.

Theorem 3: Suppose $D_{k+1}$ is the update of the diagonal matrix $D_k$, let $\Delta_k = D_{k+1} - D_k$, and assume $s_k \neq 0$. Consider the problem

$$\min \ \frac{1}{2} \| \Delta_k \|_F^2 \qquad (16)$$

subject to Eq. (15), where $\| \cdot \|_F$ denotes the Frobenius norm. Problem (16) has the following unique (optimal) solution:

$$\Delta_k = \frac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}(\Omega_k^2)} \, \Omega_k \qquad (17)$$

where $\Omega_k = \operatorname{diag}\!\left( s_{k,1}^2, s_{k,2}^2, \ldots, s_{k,n}^2 \right)$, $\sum_{i=1}^n s_{k,i}^4 = \operatorname{tr}(\Omega_k^2)$, and $\operatorname{tr}$ denotes the trace operation.

Proof: The objective function and the constraint of Eq. (16) are convex; thus, we use the Lagrangian function to obtain the unique solution as follows:

$$L(\Delta_k, \mu) = \frac{1}{2} \| \Delta_k \|_F^2 - \mu \left( s_k^T \Delta_k s_k - s_k^T y_k + s_k^T D_k s_k \right) \qquad (18)$$

where $\mu$ is the corresponding Lagrange multiplier. Setting the derivatives of Eq. (18) with respect to $\Delta_k$ and $\mu$ to zero and simplifying, we have

$$\mu = \frac{s_k^T y_k - s_k^T D_k s_k}{\sum_{i=1}^n s_{k,i}^4} \qquad (19)$$

and

$$(\Delta_k)_i = \frac{s_k^T y_k - s_k^T D_k s_k}{\sum_{i=1}^n s_{k,i}^4} \, s_{k,i}^2, \quad i = 1, 2, \ldots, n \qquad (20)$$

Here, for the diagonal matrix $D_k$, the $i$th diagonal element is denoted $(D_k)_i$, and the $i$th component of the vector $s_k$ is $s_{k,i}$. Then $\Omega_k = \operatorname{diag}\!\left( s_{k,1}^2, s_{k,2}^2, \ldots, s_{k,n}^2 \right)$ and $\sum_{i=1}^n s_{k,i}^4 = \operatorname{tr}(\Omega_k^2)$. To complete the proof, we rewrite Eq. (20) in matrix form as follows:

$$\Delta_k = \frac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}(\Omega_k^2)} \, \Omega_k \qquad (21)$$

This completes the proof.

Now, from the theorem above, we deduce that the best possible diagonal update $D_{k+1}$ is as follows:

$$D_{k+1} = D_k + \frac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}(\Omega_k^2)} \, \Omega_k \qquad (22)$$
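As a quick consistency check (our own remark, not part of the original proof), multiplying Eq. (22) by $s_k^T$ on the left and $s_k$ on the right and using $s_k^T \Omega_k s_k = \sum_{i=1}^n s_{k,i}^4 = \operatorname{tr}(\Omega_k^2)$ recovers the weak secant equation (15):

$$s_k^T D_{k+1} s_k = s_k^T D_k s_k + \frac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}(\Omega_k^2)} \, s_k^T \Omega_k s_k = s_k^T y_k.$$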

However, when $s_k$, and hence $\operatorname{tr}(\Omega_k^2)$, is very small, a safeguard is needed. To this end, we require that $\| s_k \| \ge \epsilon_1$ for some chosen small $\epsilon_1 > 0$; otherwise, we set $D_{k+1} = D_k$. The update is therefore defined as

$$D_{k+1} = \begin{cases} D_k + \dfrac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}(\Omega_k^2)} \, \Omega_k, & \| s_k \| \ge \epsilon_1 \\ D_k, & \text{otherwise} \end{cases} \qquad (23)$$

Thus, the proposed accelerated method is described as follows:

$$x_{k+1} = x_k - D_k^{-1}\left[ F(x_k) + F\!\left( x_k - D_k^{-1} F(x_k) \right) \right] \qquad (24)$$
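The sketch below shows one way the whole scheme, Eqs. (22)-(24), can be organized; it is our own Python/NumPy illustration rather than the authors' MATLAB code, and the initial diagonal $D_0 = I$ and the threshold value are assumptions made for the example.

```python
import numpy as np

def dusm(F, x0, eps1=1e-8, tol=1e-6, max_iter=500):
    """Diagonal updating Shamanskii-like method (DUSM) sketch.
    The Jacobian is replaced by a diagonal matrix D_k, stored as the vector d."""
    x = np.asarray(x0, dtype=float)
    d = np.ones_like(x)                              # D_0 = I (our choice)
    fx = F(x)
    for k in range(max_iter):
        if np.linalg.norm(fx) <= tol:
            return x, k
        # Accelerated step, Eq. (24): two chord-type substeps with the same D_k
        x_half = x - fx / d
        x_new = x - (fx + F(x_half)) / d
        f_new = F(x_new)
        s, y = x_new - x, f_new - fx
        # Diagonal update, Eqs. (22)-(23), applied only when ||s_k|| is not too small
        if np.linalg.norm(s) >= eps1:
            d = d + (s @ y - s @ (d * s)) / np.sum(s**4) * (s**2)
        x, fx = x_new, f_new
    return x, max_iter
```

Because $D_k$ is stored as a vector, each step costs only $O(n)$ arithmetic beyond the function evaluations, which is the source of the low memory and CPU-time requirements reported in the next section.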

The performance of the proposed method is tested on well-known benchmark problems employed by researchers on existing methods, using computer code written for the algorithm. The problems may be artificial or real-life problems. Artificial problems check the performance of an algorithm in situations such as points of singularity, functions with many solutions, and null-space effects, as illustrated in Figures 1-3 [7, 32].

Figure 1. Functions with a huge number of significant local optima.

Figure 2. Functions with significant null space.

Figure 3. Essentially unimodal function.

The real-life problems emerge from fields such as chemistry, engineering, and management; they often involve large data sets or complex algebraic expressions, which makes them difficult to solve.


4. Numerical results

This section demonstrates the proposed method and illustrates its advantages on some benchmark problems with dimensions ranging from 25 to 1,000 variables. These include problems with restrictions such as a singular Jacobian or problems with only one point of singularity. To evaluate the performance of the proposed diagonal updating Shamanskii method (DUSM), we employ the performance-profile tools of Dolan and Moré [33] and compare its performance with two classical Newton-type methods based on the number of iterations and the CPU time in seconds. The methods are:

  1. The Newton method (NM)

  2. The Shamanskii method (SM)

These tools represent the efficiency, robustness, and numerical comparison of different algorithms. Suppose there are $n_s$ solvers and $n_p$ problems; for each problem $p$ and solver $s$, they define

$$t_{p,s} = \text{computing effort (number of iterations or CPU time) required to solve problem } p \text{ by solver } s$$

Requiring a baseline for comparisons, they compare the performance on problem $p$ by solver $s$ with the best performance by any solver on this problem, using the performance ratio

$$r_{p,s} = \frac{t_{p,s}}{\min\{ t_{p,s} : s \in S \}}$$

We suppose that a parameter $r_M \ge r_{p,s}$ for all $p, s$ is chosen, and set $r_{p,s} = r_M$ if and only if solver $s$ does not solve problem $p$. The performance of solver $s$ on any single problem might be of interest, but since an overall assessment of the performance of the solver is preferred, the following is defined:

$$\rho_s(t) = \frac{1}{n_p} \, \operatorname{size}\left\{ p \in P : r_{p,s} \le t \right\}$$

Thus, $\rho_s(t)$ is the probability for solver $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $t \in \mathbb{R}$ of the best possible ratio. The function $\rho_s$ is then the cumulative distribution function of the performance ratio. The performance profile $\rho_s : \mathbb{R} \to [0, 1]$ of a solver is nondecreasing, piecewise constant, and continuous from the right. The value of $\rho_s(1)$ is the probability that the solver will win over the rest of the solvers. In general, a solver with a high value of $\rho_s(\tau)$, i.e., whose curve lies at the top of the figure, is preferable and represents the best solver.
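As an illustration of how $\rho_s(t)$ can be computed from a results table (our own Python sketch, not part of the original study), suppose the array T holds $t_{p,s}$ for each problem (row) and solver (column), with failures marked as np.inf:

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profiles.
    T[p, s] : measure t_{p,s} (iterations or CPU time); np.inf marks a failure.
    Returns rho[s, j] = fraction of problems with r_{p,s} <= taus[j]."""
    T = np.asarray(T, dtype=float)
    best = np.min(T, axis=1, keepdims=True)        # best solver on each problem
    r = T / best                                   # performance ratios r_{p,s}
    n_p = T.shape[0]
    return np.array([np.sum(r <= tau, axis=0) / n_p for tau in taus]).T

# Example with three problems and two solvers (solver 1 fails on problem 2):
T = np.array([[13.0, 8.0],
              [np.inf, 4.0],
              [16.0, 17.0]])
print(performance_profile(T, taus=[1.0, 2.0, 4.0]))
```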

All problems considered in this research are solved using a MATLAB (R2015a) subroutine [37] run on an Intel® Core™ i5-2410M CPU @ 2.30 GHz processor with 4 GB of RAM under the Windows 7 Professional operating system. The termination condition is set as

$$\| s_k \| + \| F(x_k) \| \le 10^{-6}$$

and the program has been designed to terminate whenever:

  • The number of iterations exceeds 500 and no iterate $x_k$ satisfies the termination condition.

  • The CPU time in seconds reaches 500.

  • Insufficient memory to initiate the run.

At any point of failure due to the above conditions, it is assumed that the number of iterations and the CPU time are not available, and such entries are denoted by "*" in the tabulated results. The following are the details of the standard test problems, the initial points used, and the exact solutions for the systems of nonlinear equations.

Problem 1 [31]: System of n nonlinear equations

Fix=1xi2+xi1+xixn2xn1xn2
$i = 1, 2, 3, \ldots, n; \quad x_0 = (0.3, 0.3, \ldots, 0.3)^T$

Problem 2 [34]: Systems of n nonlinear equations

Fix=xi2cosxi1
$i = 1, 2, 3, \ldots, n; \quad x_0 = (0.2, 0.2, \ldots, 0.2)^T$

Problem 3 [31]: Structured exponential function

Fix=exi1
Fnx=xn0.1xn2
$i = 1, 2, 3, \ldots, n; \quad x_0 = (0.05, 0.05, \ldots, 0.05)^T$

Problem 4 [35]: Extended trigonometric of Byeong-Chun

Fix=cosxi211
$i = 1, 2, 3, \ldots, n; \quad x_0 = (0.06, 0.06, \ldots, 0.06)^T$

Problem 5 [36]: Extended sparse system of Byeong

Fix=xi2xi2
$i = 1, 2, 3, \ldots, n; \quad x_0 = (1.1, 1.1, \ldots, 1.1)^T$

Table 1 shows the number of iterations (NI) and the CPU time for the Newton method (NM), the Shamanskii method (SM), and the proposed diagonal updating Shamanskii method (DUSM). The performance of these methods was analyzed via storage locations and execution time. It can be observed that the proposed DUSM was able to solve all the test problems, while NM and SM fail at some points because the matrix is singular to working precision. This shows that the diagonal scheme employed provides an alternative in the case of singularity, thereby reducing the computational cost of the classical Newton-type methods.

Problem   Dim    NM                DUSM              SM
                 NI    CPU         NI    CPU         NI    CPU
1         25     13    0.016102    8     0.034777    13    0.015999
2         25     6     0.009522    7     0.028231    7     0.010412
3         25     *     *           4     0.023766    *     *
4         25     16    0.019679    17    0.077072    22    0.022889
5         25     4     0.006605    16    0.061750    4     0.005761
1         50     13    0.032998    8     0.090310    13    0.032271
2         50     10    0.022134    7     0.089785    7     0.017036
3         50     4     0.010350    4     0.052899    4     0.010238
4         50     30    0.054640    17    0.228077    23    0.041569
5         50     4     0.012361    16    0.201262    4     0.010735
1         100    13    0.073565    8     0.339333    13    0.066363
2         100    10    0.054075    7     0.292001    7     0.044512
3         100    *     *           4     0.175300    *     *
4         100    15    0.075073    18    0.770165    25    0.118170
5         100    4     0.029221    17    0.755556    4     0.023154
1         1000   13    1.868606    8     27.171776   13    2.042222
2         1000   10    1.444533    7     24.295632   7     1.045329
3         1000   *     *           4     27.1250     *     *
4         1000   52    6.757533    19    63.981376   39    5.138997
5         1000   4     0.610145    18    62.364143   4     0.612590

Table 1.

Numerical comparison of NM, DUSM, and SM.


5. Conclusion

This chapter proposes a diagonal updating formula for systems of nonlinear equations that contributes to a reduction in Jacobian evaluation cost. From the computational experiments, we conclude that the proposed scheme is reliable and efficient and reduces the Jacobian computation cost during the iteration process. Moreover, the proposed scheme compares favorably with the classical and existing numerical methods for solving systems of nonlinear equations.

References

  1. Kress R. Numerical Analysis. New York: Springer Science+Business Media, LLC; 1998
  2. Sulaiman IM. New iterative methods for solving fuzzy and dual fuzzy nonlinear equations [PhD thesis]. Malaysia: Faculty of Informatics and Computing, Universiti Sultan Zainal Abidin; 2018
  3. Sun W, Yuan YX. Optimization Theory and Methods, Springer Optimization and Its Applications. Boston, MA: Springer; 2006
  4. Hauser JR. Numerical Methods for Nonlinear Engineering Models. Netherlands: Springer; 2009
  5. Burden RL, Faires JD. Numerical Analysis. 8th ed. USA: Thomson; 2005
  6. Nocedal J, Wright SJ. Numerical Optimization. 2nd ed. Berlin, Germany: Springer; 1999
  7. Dennis JE Jr, Schnabel RB. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Philadelphia, PA: SIAM; 1996
  8. Griva I, Nash SG, Sofer A. Linear and Nonlinear Optimization. 2nd ed. Philadelphia: SIAM; 2009
  9. Kelley CT. A Shamanskii-like acceleration scheme for nonlinear equations at singular roots. Mathematics of Computation. 1986;47:609-623
  10. Kelley CT. Iterative Methods for Linear and Nonlinear Equations. Philadelphia, PA: SIAM; 1995
  11. Shamanskii VE. A modification of Newton’s method. Ukrains’kyi Matematychnyi Zhurnal. 1967;19:133-138
  12. Shamanskii VE. On a realization of Newton’s method on electronic computers. Ukrains’kyi Matematychnyi Zhurnal. 1966;18(6):135-140
  13. Traub JF. Iterative Methods for the Solution of Equations. Englewood Cliffs: Prentice-Hall; 1964
  14. Kchouk B, Dussault J. The Chebyshev–Shamanskii method for solving systems of nonlinear equations. Journal of Optimization Theory and Applications. 2013;157:148-167
  15. Ortega JM, Rheinboldt WC. Iterative Solution of Nonlinear Equations in Several Variables. New York: Academic Press; 1970
  16. Brent RP. Some efficient algorithms for solving systems of nonlinear equations. SIAM Journal on Numerical Analysis. 1973;10(2):327-344
  17. Waziri MY, Leong WJ, Mamat M, Moyi AU. Two-step derivative-free diagonally Newton’s method for large-scale nonlinear equations. World Applied Sciences Journal. 2013;21:86-94. DOI: 10.5829/idosi.wasj.2013.21.am.2045
  18. Broyden CG. A class of methods for solving nonlinear simultaneous equations. Mathematics of Computation. 1965;19(92):577-593
  19. Chong EKP, Zak SH. An Introduction to Optimization, Wiley Series in Discrete Mathematics and Optimization. New York: John Wiley and Sons; 2013
  20. Hueso JL, Martínez E, Torregrosa JR. Modified Newton’s method for systems of nonlinear equations with singular Jacobian. Journal of Computational and Applied Mathematics. 2009;224:77-83
  21. Leong WJ, Hassan MA, Waziri MY. A matrix-free quasi-Newton method for solving large-scale nonlinear systems. Computational and Applied Mathematics. 2011;625:2354-2363
  22. Krejić N, Lužanin Z. Newton-like methods with modification of the right-hand side vector. Mathematics of Computation. 2002;71:237-250
  23. Waziri MY, Leong WJ, Hassan MA, Monsi M. A new Newton’s method with diagonal Jacobian approximation for systems of nonlinear equations. Journal of Mathematics and Statistics. 2010;6(3):246-252
  24. Lampariello F, Sciandrone M. Global convergence technique for the Newton method with periodic Hessian evaluation. Journal of Optimization Theory and Applications. 2001;111(2):341-358
  25. Sulaiman IM, Mamat M, Nurnadiah Z, Puspa LG. Solving dual fuzzy nonlinear equations via Shamanskii method. International Journal of Engineering & Technology. 2018;7(3.28):89-91
  26. Ypma TJ. Historical development of the Newton-Raphson method. SIAM Review. 1995;37(4):531-551
  27. Hao L, Qin N. Incomplete Jacobian Newton method for nonlinear equation. Computers and Mathematics with Applications. 2008;56(1):218-227
  28. Cheney EW, Kincaid D. Numerical Mathematics and Computing. Asia: Nelson Education, Cengage Learning; 2012
  29. Sulaiman IM, Mamat M, Afendee MM, Waziri MY. Diagonal updating Shamanskii-like method for solving singular fuzzy nonlinear equation. Far East Journal of Mathematical Sciences. 2017;103(10):1619-1629
  30. Waziri MY, Leong WJ, Hassan MA, Monsi M. Jacobian-free Newton’s method for systems of nonlinear equations. Journal of Numerical Mathematics and Stochastics. 2010;2(1):54-63
  31. Waziri MY, Abdulmajid Z. An improved diagonal Jacobian approximation via a quasi-Cauchy condition for solving large-scale systems of nonlinear equations. Journal of Applied Mathematics. 2013;3:1-6
  32. Andrei N. An unconstrained optimization test functions collection. Advanced Modeling and Optimization. 2008;10:147-161
  33. Dolan E, Moré JJ. Benchmarking optimization software with performance profiles. Mathematical Programming. 2002;91(2):201-213
  34. Hafiz MA, Muhammad SMB. An efficient two-step iterative method for solving systems of nonlinear equations. Journal of Mathematics Research. 2012;4(4):28-34
  35. Shin BC, Darvishi MT, Kim CH. A comparison of the Newton-Krylov method with high-order Newton-like methods to solve nonlinear systems. Applied Mathematics and Computation. 2010;217(7):3190-3198
  36. Mamat M, Muhammad K, Waziri MY. Trapezoidal Broyden’s method for systems of nonlinear equations. Applied Mathematical Sciences. 2014;8(6):54-63
  37. Sulaiman I, Mamat M, Waziri MY, Umar AO, et al. Journal of Advanced Research in Modelling and Simulations. 2018;1(1):13-18
