
A Shamanskii-Like Accelerated Scheme for Nonlinear Systems of Equations

By Ibrahim Mohammed Sulaiman, Mustafa Mamat and Umar Audu Omesa

Submitted: March 20th, 2019 | Reviewed: June 4th, 2019 | Published: May 13th, 2020

DOI: 10.5772/intechopen.87246


Abstract

Newton-type methods with a diagonal update to the Jacobian matrix are regarded as among the most efficient and low-memory schemes for systems of nonlinear equations. One of the main advantages of these methods is their ability to solve nonlinear systems whose Fréchet derivative is singular at the root. In this chapter, we present a Jacobian approximation to the Shamanskii method to obtain a convergent and accelerated scheme for systems of nonlinear equations. Precisely, we focus on the efficiency of the proposed method and compare its performance with that of existing methods. Numerical examples illustrate the efficiency and the theoretical analysis of the proposed method.

Keywords

  • Newton method
  • Shamanskii method
  • diagonal updating scheme
  • nonlinear equations
  • Jacobian matrix

1. Introduction

Many scientific and management problems are formulated as finding the values of $x$ at which a given function evaluates to zero [1]. This description can be represented mathematically by the following system of nonlinear equations:

$$\begin{aligned} f_1(x_1, x_2, \ldots, x_n) &= 0 \\ f_2(x_1, x_2, \ldots, x_n) &= 0 \\ &\;\;\vdots \\ f_n(x_1, x_2, \ldots, x_n) &= 0 \end{aligned} \tag{1}$$

where $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ is a vector and each $f_i$ is a nonlinear function for $i = 1, 2, \ldots, n$. The above system of equations (1) can be written compactly as

$$F(x) = 0, \tag{2}$$

where $F : \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable in an open neighborhood of the solution $x^*$. Such systems arise as natural descriptions of observed phenomena in numerous real-life problems, and obtaining their solutions is an important goal of mathematical study. Recently, this area has been studied extensively [2, 3]. The most powerful technique for handling nonlinear systems of equations is to linearize the equations and iterate on the linearized set until an accurate solution is obtained [4]. This requires the derivative or gradient of the equations. Various scholars stress that the derivatives should be obtained analytically rather than numerically. However, this is not always convenient and, in many cases, not even possible, as the equations may be generated by a computer algorithm [2]. For a problem in one variable, the system defined in (2) reduces to a function $F : \mathbb{R} \to \mathbb{R}$, where $f$ is continuous on an interval $[a, b]$.

Definition 1: Consider a system of equations $f_1, f_2, \ldots, f_n$; a solution of this system, whether in one, two, or $n$ variables, is a point $(a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$ such that $f_1(a_1, a_2, \ldots, a_n) = f_2(a_1, a_2, \ldots, a_n) = \cdots = f_n(a_1, a_2, \ldots, a_n) = 0$.

In general, the problem to be considered is that, for some function $f(x)$, one wishes to evaluate the derivative at some point $x$, i.e.,

$$\text{given } f(x), \text{ evaluate } \frac{df}{dx}.$$

This derivative is often used to represent the instantaneous rate of change of the function at a given point [5].

Definition 2: For a smooth function $f(x)$, there exists at any point $x$ a vector of first-order partial derivatives, the gradient vector:

$$\nabla f(x) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^T = g(x).$$

The Taylor series expansion of the function $f(x)$ about a point $x_0$ is an ideal starting point for this discussion [1].

Definition 3: Let $f$ be a differentiable function; the Taylor series of $f(x)$ around a point $a$ is the infinite sum:

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$$

However, a continuously differentiable vector-valued function need not satisfy the mean value theorem (MVT), an essential tool in calculus [6]. Hence, researchers suggested the following theorem as a replacement for the mean value theorem.

Theorem 1: Let $F : \mathbb{R}^n \to \mathbb{R}^m$ be continuously differentiable in an open convex set $D \subset \mathbb{R}^n$. For any $x, x + s \in D$,

$$F(x + s) - F(x) = \int_0^1 J(x + ts)\, s \, dt = \int_x^{x+s} F'(z)\, dz.$$

Definition 4: Suppose $F : \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable at the point $x \in \mathbb{R}^n$ and each component function $f_1, f_2, \ldots, f_n$ is also continuously differentiable at $x$; then the derivative of $F$ at $x$ is defined as

$$J_F(x) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial f_n}{\partial x_1} & \dfrac{\partial f_n}{\partial x_2} & \cdots & \dfrac{\partial f_n}{\partial x_n} \end{pmatrix}.$$
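When an analytic Jacobian is unavailable (as noted above, the equations may be produced by a computer algorithm), each column of $J_F(x)$ can be approximated by forward differences. The following is a minimal sketch of this idea, not code from the chapter; the helper name `jacobian_fd` and the step size `h` are illustrative assumptions.

```python
import numpy as np

def jacobian_fd(F, x, h=1e-7):
    """Forward-difference approximation of the Jacobian J_F(x).

    F : callable mapping an n-vector to an n-vector.
    x : point at which to approximate the Jacobian.
    h : finite-difference step size (an illustrative choice).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    Fx = np.asarray(F(x), dtype=float)
    J = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (np.asarray(F(x + e)) - Fx) / h  # column j ~ dF/dx_j
    return J

# Example on F(x) = (x1^2 - 1, x1*x2); analytic Jacobian at (1, 2) is [[2, 0], [2, 1]]:
F = lambda x: np.array([x[0]**2 - 1.0, x[0] * x[1]])
print(jacobian_fd(F, np.array([1.0, 2.0])))
```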

Most of the algorithms employed for obtaining the solution of Eq. (1) center on approximating the Jacobian matrix, which provides a linear map $T_x : \mathbb{R}^n \to \mathbb{R}^n$ defined by Eq. (3):

$$T_x(s) = J_F(x)\, s, \qquad x \in \mathbb{R}^n. \tag{3}$$

Also, if $F$ is differentiable at a point $x^*$, then the affine function $A(x) = F(x^*) + J(x^*)(x - x^*)$ is a good approximation to $F(x)$ near $x = x^*$, in the sense that

$$\lim_{x \to x^*} \frac{\left\| F(x) - F(x^*) - J(x^*)(x - x^*) \right\|}{\left\| x - x^* \right\|} = 0, \tag{4}$$

where $\|x - x^*\| = \sqrt{(x_1 - x_1^*)^2 + (x_2 - x_2^*)^2 + \cdots + (x_n - x_n^*)^2}$.

If all the partial derivatives of the component functions $f_1, f_2, \ldots, f_n$ appearing in $J_F(x)$ are continuous, then we say the function $F$ is differentiable.

The most famous method for solving the nonlinear system $F(x) = 0$ is the Newton method, which generates a sequence $\{x_k\}$ from any given initial point $x_0$ via the iteration

$$x_{k+1} = x_k - F'(x_k)^{-1} F(x_k), \tag{5}$$

where $F'(x_k)$ is the Jacobian of $F$ at $x_k$. The sequence in Eq. (5) converges quadratically to the solution $x^*$ if $x_0$ is sufficiently near the solution and the Jacobian $F'(x^*)$ is nonsingular [7, 8]. This convergence rate makes the method outstanding among numerical methods. However, evaluating the Jacobian and solving the linear system for the step $s_n = -F'(x_n)^{-1} F(x_n)$ are expensive and time-consuming [9]. This led to the study of different variants of the Newton method for systems of nonlinear equations. One of the simplest and lowest-cost variants, which almost entirely avoids derivative evaluation during the iteration, is the chord method. For a finite-dimensional problem, this scheme computes the Jacobian matrix $F'(x_0)$ only once throughout the iteration process, as presented in Eq. (6):

$$x_{k+1} = x_k - F'(x_0)^{-1} F(x_k). \tag{6}$$

The rate of convergence is linear and improves as the initial point gets better. Suppose $x_0$ is chosen sufficiently near the solution $x^*$ and $F'(x^*)$ is nonsingular; then, for some $K_c > 0$, we have

$$\|x_{n+1} - x^*\| \le K_c \, \|x_0 - x^*\| \, \|x_n - x^*\|. \tag{7}$$

The convergence theorems and the proof of Eq. (7) can be found in [9, 10]. Motivated by the excellent convergence of the Newton method and the low Jacobian-evaluation cost of the chord method, a method due originally to Shamanskii [11, 12], which lies between the Newton method and the chord method, was proposed and has been analyzed by Kelley [9, 13, 14, 15]. Other variants of the Newton method with different Jacobian approximation schemes include [9, 14, 16, 17, 18]. However, most of these methods require the computation and storage of the full or approximate Jacobian, which becomes very difficult and time-consuming as the dimension of the system increases [10, 19].
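To make the trade-off between the two schemes concrete, here is a minimal sketch of the Newton iteration of Eq. (5) and the chord iteration of Eq. (6). It assumes user-supplied callables `F` (residual) and `J` (Jacobian); the function names, tolerance, and iteration cap are illustrative choices, not the authors' code.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def newton(F, J, x0, tol=1e-6, max_iter=500):
    """Newton iteration, Eq. (5): a fresh Jacobian solve every step."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            return x, k
        x = x + np.linalg.solve(J(x), -Fx)   # s_k = -F'(x_k)^{-1} F(x_k)
    return x, max_iter

def chord(F, J, x0, tol=1e-6, max_iter=500):
    """Chord iteration, Eq. (6): factor F'(x_0) once and reuse it."""
    x = np.asarray(x0, dtype=float)
    fac = lu_factor(J(x))                    # single Jacobian factorization
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            return x, k
        x = x + lu_solve(fac, -Fx)           # cheap triangular solves only
    return x, max_iter
```

The only difference is where the Jacobian is evaluated and factored: once per step for Newton, once in total for the chord method.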

It would be worthwhile to construct a derivative-free approach and analyze it against existing techniques [20, 21, 22]. The aim of this work is to derive a diagonal-matrix approximation to the Jacobian of the Shamanskii method by means of variational techniques. The expectation is to reduce the computational cost, storage, and CPU time of solving a given problem. The proposed method works efficiently by combining the good convergence rate of the Shamanskii method with a derivative-free approach, and the results are very encouraging. The next section presents the Shamanskii method for nonlinear systems of equations.

2. Shamanskii method

It is known that the Newton method defined in Eq. (5) converges quadratically to $x^*$ when the initial guess is sufficiently close to the root [7, 10, 19]. The major concern about this method is the evaluation and storage of the Jacobian matrix at every iteration [23]. A scheme that almost completely overcomes this is the chord method. For a finite-dimensional problem, this method factors the Jacobian matrix only once, thereby reducing the evaluation cost of each iteration as in Eq. (6), at the price of degrading the convergence rate to linear [10].

Motivated by this, a method due originally to Shamanskii [11] was developed and analyzed in [7, 13, 14, 16, 24]. Starting with an initial approximation $x_0$, this method uses multiple pseudo-Newton steps, as described below:

$$x_{k+1/2} = x_k - F'(x_k)^{-1} F(x_k), \tag{8}$$
$$x_{k+1} = x_{k+1/2} - F'(x_k)^{-1} F(x_{k+1/2}). \tag{9}$$

After a little simplification, we have

$$x_{k+1} = x_k - F'(x_k)^{-1} \left[ F(x_k) + F\!\left( x_k - F'(x_k)^{-1} F(x_k) \right) \right]. \tag{10}$$

This method converges superlinearly with q-order at least $t + 1$, where $t$ is the number of inner pseudo-Newton steps, when the initial approximation $x_0$ is chosen sufficiently near the solution $x^*$ and $F'(x^*)$ is nonsingular. This implies that there exists $K_s > 0$ such that

$$\|x_{n+1} - x^*\| \le K_s \, \|x_n - x^*\|^{t+1}. \tag{11}$$

Combining Eq. (7) with the quadratic convergence of the Newton method produces the convergence rate of the Shamanskii method given in Eq. (11). Thus, the balance is between the reduced cost of Fréchet-derivative evaluation and Jacobian factorization in the Shamanskii method and the rapid convergence of the Newton method. This low-cost derivative evaluation, as well as the rapid convergence of several methods including the Shamanskii method, has been studied and analyzed in [13, 15]. From the analysis, the researchers conclude that the Shamanskii method shows superior performance compared to the Newton method whenever work is measured in terms of function evaluations [9]. Also, if the value of $t$ is suitably chosen, then, as the dimension increases, the performance of the Shamanskii method improves, reducing the complexity of factoring the approximate Jacobian across the pseudo-Newton iterations [14]. Please refer to [15] for the proof of the convergence theorem described below.
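As an illustration of Eqs. (8)–(10), the sketch below reuses a single Jacobian factorization for $t$ pseudo-Newton steps; $t = 1$ recovers the Newton method, while large $t$ approaches the chord method. This is a hedged sketch under the same assumptions as before (`F` and `J` are user-supplied callables), not the authors' implementation.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def shamanskii(F, J, x0, t=2, tol=1e-6, max_iter=500):
    """Shamanskii iteration, Eqs. (8)-(10): one Jacobian factorization
    is reused for t pseudo-Newton steps. t = 1 recovers the Newton
    method; large t approaches the chord method."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        if np.linalg.norm(F(x)) <= tol:
            return x, k
        fac = lu_factor(J(x))            # factor once per outer sweep
        for _ in range(t):               # reuse it for the inner steps
            x = x + lu_solve(fac, -F(x))
    return x, max_iter
```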

Theorem 2 [15]: Let $F : D \subset \mathbb{R}^n \to \mathbb{R}^n$ satisfy hypotheses H1, H2, and H3 of [15]. Then the solution $x^*$ is a point of attraction of the Shamanskii iteration, i.e., Eq. (10), and the method possesses at least cubic order of convergence.

3. Diagonal updating scheme for solving nonlinear systems

Even though the Shamanskii method generally reduces the computational cost, evaluating or inverting the Jacobian matrix at every iteration, or even every few iterations, remains expensive [14, 25, 26, 27, 28]. As a matter of fact, by adding a diagonal updating scheme to a method, we obtain a new low-memory iterative approach in which the Jacobian $F'(x_k)$ is approximated by a nonsingular diagonal matrix that can be updated at every iteration [29, 30, 31]. Indeed, using the Shamanskii procedure, the proposed method avoids the main complexity of Newton-type methods by reusing the evaluated Jacobian approximation during the iteration process. This is the basic idea of the Shamanskii-like method, which is described as follows.

Given an initial approximation $x_0$, we solve Eq. (2) iteratively, replacing the Jacobian $F'(x_k)$ by a diagonal approximation, say $D_k$, as follows:

$$F'(x_{k+1}) \approx D_{k+1}. \tag{12}$$

Let $s_k = x_{k+1} - x_k$ and $y_k = F(x_{k+1}) - F(x_k)$; by the mean value theorem (MVT), we have

$$D_{k+1} s_k \approx y_k. \tag{13}$$

Substituting Eq. (12) in Eq. (13), we have

$$F'(x_{k+1})\, s_k \approx y_k. \tag{14}$$

Since $D_{k+1}$ is the update of the diagonal matrix $D_k$, let us assume $D_{k+1}$ satisfies the weak secant equation

$$s_k^T D_{k+1} s_k = s_k^T y_k, \tag{15}$$

which is used to minimize the deviation between $D_{k+1}$ and $D_k$ in an appropriate norm. The update formula for $D_k$ follows from the theorem below.

Theorem 3: Suppose $D_{k+1}$ is the update of the diagonal matrix $D_k$; let $\Delta_k = D_{k+1} - D_k$ and $s_k \neq 0$. Consider the problem

$$\min \ \frac{1}{2} \|\Delta_k\|_F^2 \tag{16}$$

subject to Eq. (15), where $\|\cdot\|_F$ denotes the Frobenius norm. Problem (16) has the following unique (optimal) solution:

$$\Delta_k = \frac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}\!\left(\Omega_k^2\right)} \, \Omega_k, \tag{17}$$

where $\Omega_k = \operatorname{diag}\!\left(s_{k,1}^2, s_{k,2}^2, \ldots, s_{k,n}^2\right)$, $\sum_{i=1}^n s_{k,i}^4 = \operatorname{tr}\!\left(\Omega_k^2\right)$, and $\operatorname{tr}(\cdot)$ is the trace operator.

Proof: It is known that the objective function and the constraint of Eq. (16) are convex; thus, we use the Lagrangian function to obtain the unique solution:

$$L(\Delta_k, \mu) = \frac{1}{2} \|\Delta_k\|_F^2 + \mu \left( s_k^T \Delta_k s_k - \left( s_k^T y_k - s_k^T D_k s_k \right) \right), \tag{18}$$

where $\mu$ is the corresponding Lagrange multiplier. Since $\Delta_k$ is diagonal, setting $\partial L / \partial (\Delta_k)_{ii} = (\Delta_k)_{ii} + \mu s_{k,i}^2 = 0$ and substituting into the constraint gives

$$\mu = -\frac{s_k^T y_k - s_k^T D_k s_k}{\sum_{i=1}^n s_{k,i}^4} \tag{19}$$

and

$$(\Delta_k)_{ii} = \frac{s_k^T y_k - s_k^T D_k s_k}{\sum_{i=1}^n s_{k,i}^4} \, s_{k,i}^2, \qquad i = 1, 2, \ldots, n. \tag{20}$$

Here, $D_{k,i}$ denotes the $i$th diagonal element of $D_k$ and $s_{k,i}$ the $i$th component of the vector $s_k$; recall that $\Omega_k = \operatorname{diag}\!\left(s_{k,1}^2, s_{k,2}^2, \ldots, s_{k,n}^2\right)$ and $\sum_{i=1}^n s_{k,i}^4 = \operatorname{tr}\!\left(\Omega_k^2\right)$. To complete the proof, we rewrite Eq. (20) in matrix form:

$$\Delta_k = \frac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}\!\left(\Omega_k^2\right)} \, \Omega_k. \tag{21}$$

This completes the proof.

Now, from the theorem above, we deduce that the best possible diagonal update $D_{k+1}$ is

$$D_{k+1} = D_k + \frac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}\!\left(\Omega_k^2\right)} \, \Omega_k. \tag{22}$$
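As a sanity check, the update of Eq. (22) can be verified numerically to satisfy the weak secant equation (15); the snippet below is an illustration with random data, not part of the chapter.

```python
import numpy as np

# Check that the update of Eq. (22) satisfies s^T D_{k+1} s = s^T y (Eq. (15)).
rng = np.random.default_rng(0)
n = 5
s = rng.normal(size=n)
y = rng.normal(size=n)
D = np.abs(rng.normal(size=n)) + 1.0    # diagonal of D_k, stored as a vector

Omega = s**2                             # diagonal of Omega_k
D_new = D + ((s @ y - s @ (D * s)) / np.sum(s**4)) * Omega   # Eq. (22)

lhs = s @ (D_new * s)                    # s^T D_{k+1} s
rhs = s @ y                              # s^T y_k
print(np.isclose(lhs, rhs))             # True
```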

However, for very small $s_k$ (and hence small $\operatorname{tr}(\Omega_k^2)$), a safeguard is needed. To this end, we apply the update only when $\|s_k\| \ge \epsilon_1$ for some chosen small $\epsilon_1 > 0$; otherwise, we set $D_{k+1} = D_k$. Thus $D_{k+1}$ is defined as

$$D_{k+1} = \begin{cases} D_k + \dfrac{s_k^T y_k - s_k^T D_k s_k}{\operatorname{tr}\!\left(\Omega_k^2\right)} \, \Omega_k, & \|s_k\| \ge \epsilon_1, \\[2mm] D_k, & \text{otherwise.} \end{cases} \tag{23}$$

Thus, the proposed accelerated method is described as follows:

$$x_{k+1} = x_k - D_k^{-1} \left[ F(x_k) + F\!\left( x_k - D_k^{-1} F(x_k) \right) \right]. \tag{24}$$
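Putting Eqs. (22)–(24) together, a minimal sketch of the proposed scheme might look as follows. Only the residual `F` is required (no Jacobian); the diagonal matrix is stored as a vector, so each "solve" is an elementwise division. The defaults for `t` and the safeguard `eps1` are assumptions for illustration, not values from the chapter.

```python
import numpy as np

def dusm(F, x0, t=2, tol=1e-6, max_iter=500, eps1=1e-8):
    """Sketch of the diagonal-updating Shamanskii-like method.

    The Jacobian is replaced by a diagonal matrix D_k (stored as a
    vector), reused for t inner steps as in Eq. (24) and updated by
    the safeguarded rule of Eq. (23). eps1 plays the role of the
    chapter's epsilon_1 (the value here is an assumption)."""
    x = np.asarray(x0, dtype=float)
    D = np.ones_like(x)                      # D_0 = identity
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            return x, k
        x_old, F_old = x.copy(), Fx
        for _ in range(t):                   # inner steps of Eq. (24)
            x = x - F(x) / D                 # D^{-1} F(x), elementwise
        s, y = x - x_old, F(x) - F_old
        if np.linalg.norm(s) >= eps1:        # safeguard of Eq. (23)
            Omega = s**2                     # diagonal of Omega_k
            D = D + ((s @ y - s @ (D * s)) / np.sum(s**4)) * Omega  # Eq. (22)
    return x, max_iter
```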

The performance of the proposed method is tested on well-known benchmark problems employed in the literature, using computer code written for the algorithm. The problems may be artificial or drawn from real life. Artificial problems check the performance of an algorithm in situations such as singular points, functions with many solutions, and null-space effects, as illustrated in Figures 1–3 [7, 32].

Figure 1. Functions with a huge number of significant local optima.

Figure 2. Functions with significant null space.

Figure 3. Essentially unimodal function.

Real-life problems, on the other hand, emerge from fields such as chemistry, engineering, and management; they often involve large data sets or complex algebraic expressions, which makes them difficult to solve.

4. Numerical results

This section demonstrates the proposed method and illustrates its advantages on benchmark problems with dimensions ranging from 25 to 1,000 variables, including problems with restrictions such as a singular Jacobian or a single point of singularity. To evaluate the performance of the proposed diagonal updating Shamanskii method (DUSM), we employ the performance-profile tools of Dolan and Moré [33] and compare against two classical Newton-type methods, based on the number of iterations and the CPU time in seconds. The methods are:

  1. The Newton method (NM)

  2. The Shamanskii method (SM)

These tools are used to represent the efficiency, the robustness, and the numerical comparison of different algorithms. Suppose there are $n_s$ solvers and $n_p$ problems; for each problem $p$ and solver $s$, they define

$$t_{p,s} = \text{the computing cost (number of iterations or CPU time) required to solve problem } p \text{ by solver } s.$$

Requiring a baseline for comparisons, they compare the performance on problem $p$ by solver $s$ with the best performance by any solver on this problem, using the performance ratio

$$r_{p,s} = \frac{t_{p,s}}{\min\{ t_{p,s} : s \in S \}}.$$

A parameter $r_M \ge r_{p,s}$ for all $p, s$ is chosen, and $r_{p,s} = r_M$ if and only if solver $s$ does not solve problem $p$. The performance of solver $s$ on any single problem may be of interest, but since an overall assessment of each solver is preferred, one defines

$$\rho_s(t) = \frac{1}{n_p} \, \operatorname{size}\{ p \in P : r_{p,s} \le t \}.$$

Thus, $\rho_s(t)$ is the probability for solver $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $t \in \mathbb{R}$ of the best possible ratio. The function $\rho_s$ is the cumulative distribution function of the performance ratio. The performance profile $\rho_s : \mathbb{R} \to [0, 1]$ of a solver is nondecreasing, piecewise constant, and continuous from the right. The value of $\rho_s(1)$ is the probability that the solver will win over the rest of the solvers. In general, a solver with a high value of $\rho_s(\tau)$, i.e., one whose curve sits at the top right of the figure, is preferable and represents the best solver.
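Computing $\rho_s(t)$ from a table of costs is straightforward. The sketch below follows the definitions above; the failure convention (`np.inf` for unsolved problems, standing in for $r_M$) and the function name are assumptions for illustration.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profiles.

    T    : array of shape (n_problems, n_solvers) with costs t_{p,s}
           (iterations or CPU time); np.inf marks a failed run.
    taus : 1-D array of thresholds tau.
    Returns P with P[i, s] = rho_s(taus[i])."""
    best = T.min(axis=1, keepdims=True)          # best cost per problem
    R = T / best                                 # ratios r_{p,s}
    n_p = T.shape[0]
    return np.vstack([(R <= tau).sum(axis=0) / n_p for tau in taus])

# Tiny illustration with 3 problems and 2 solvers:
T = np.array([[1.0, 2.0],
              [3.0, 3.0],
              [np.inf, 5.0]])                    # solver 0 fails problem 2
print(performance_profile(T, np.array([1.0, 2.0, 4.0])))
```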

All problems considered in this research are solved using a MATLAB (R2015a) subroutine [37], run on an Intel® Core™ i5-2410M CPU @ 2.30 GHz with 4 GB of RAM under Windows 7 Professional. The termination condition is set as

$$\|s_k\| + \|F(x_k)\| \le 10^{-6},$$

and the program has been designed to terminate whenever:

  • The number of iterations exceeds 500 and no iterate $x_k$ satisfies the termination condition.

  • The CPU time in seconds reaches 500.

  • Insufficient memory to initiate the run.

When a run fails due to any of the above conditions, the number of iterations and the CPU time are not reported, and the corresponding entries are denoted by "*" in the tabulated results. The following are the details of the standard test problems and the initial points used for the systems of nonlinear equations.

Problem 1 [31]: System of $n$ nonlinear equations

$$F_i(x) = \left(1 - x_i^2\right) + x_i\left(1 + x_i x_{n-2} x_{n-1} x_n\right) - 2,$$
$$i = 1, 2, 3, \ldots, n, \qquad x_0 = (0.3, 0.3, \ldots, 0.3).$$

Problem 2 [34]: System of $n$ nonlinear equations

$$F_i(x) = x_i^2 - \cos(x_i) - 1,$$
$$i = 1, 2, 3, \ldots, n, \qquad x_0 = (0.2, 0.2, \ldots, 0.2).$$

Problem 3 [31]: Structured exponential function

$$F_i(x) = e^{x_i} - 1, \quad i = 1, 2, \ldots, n - 1, \qquad F_n(x) = x_n - 0.1\, x_n^2,$$
$$x_0 = (0.05, 0.05, \ldots, 0.05).$$

Problem 4 [35]: Extended trigonometric function of Byeong-Chun

$$F_i(x) = \cos\!\left(x_i^2 - 1\right) - 1,$$
$$i = 1, 2, 3, \ldots, n, \qquad x_0 = (0.06, 0.06, \ldots, 0.06).$$

Problem 5 [36]: Extended sparse system of Byeong

$$F_i(x) = x_i^2 - x_i - 2,$$
$$i = 1, 2, 3, \ldots, n, \qquad x_0 = (1.1, 1.1, \ldots, 1.1).$$

Table 1 shows the number of iterations (NI) and the CPU time (in seconds) for the Newton method (NM), the Shamanskii method (SM), and the proposed diagonal updating method (DUSM). The performance of these methods was analyzed via storage locations and execution time. It can be observed that the proposed DUSM solves all the test problems, while NM and SM fail on some problems because the matrix becomes singular to working precision. This shows that the diagonal scheme provides a remedy in the case of singularity, thereby reducing the computational cost of the classical Newton-type methods.

| Problem | Dim | NM NI | NM CPU (s) | DUSM NI | DUSM CPU (s) | SM NI | SM CPU (s) |
|---|---|---|---|---|---|---|---|
| 1 | 25 | 13 | 0.016102 | 8 | 0.034777 | 13 | 0.015999 |
| 2 | 25 | 6 | 0.009522 | 7 | 0.028231 | 7 | 0.010412 |
| 3 | 25 | * | * | 4 | 0.023766 | * | * |
| 4 | 25 | 16 | 0.019679 | 17 | 0.077072 | 22 | 0.022889 |
| 5 | 25 | 4 | 0.006605 | 16 | 0.061750 | 4 | 0.005761 |
| 1 | 50 | 13 | 0.032998 | 8 | 0.090310 | 13 | 0.032271 |
| 2 | 50 | 10 | 0.022134 | 7 | 0.089785 | 7 | 0.017036 |
| 3 | 50 | 4 | 0.010350 | 4 | 0.052899 | 4 | 0.010238 |
| 4 | 50 | 30 | 0.054640 | 17 | 0.228077 | 23 | 0.041569 |
| 5 | 50 | 4 | 0.012361 | 16 | 0.201262 | 4 | 0.010735 |
| 1 | 100 | 13 | 0.073565 | 8 | 0.339333 | 13 | 0.066363 |
| 2 | 100 | 10 | 0.054075 | 7 | 0.292001 | 7 | 0.044512 |
| 3 | 100 | * | * | 4 | 0.175300 | * | * |
| 4 | 100 | 15 | 0.075073 | 18 | 0.770165 | 25 | 0.118170 |
| 5 | 100 | 4 | 0.029221 | 17 | 0.755556 | 4 | 0.023154 |
| 1 | 1000 | 13 | 1.868606 | 8 | 27.171776 | 13 | 2.042222 |
| 2 | 1000 | 10 | 1.444533 | 7 | 24.295632 | 7 | 1.045329 |
| 3 | 1000 | * | * | 4 | 27.1250 | * | * |
| 4 | 1000 | 52 | 6.757533 | 19 | 63.981376 | 39 | 5.138997 |
| 5 | 1000 | 4 | 0.610145 | 18 | 62.364143 | 4 | 0.612590 |

Table 1. Numerical comparison of NM, DUSM, and SM ("*" denotes a failed run).

5. Conclusion

This chapter proposes a diagonal updating formula for systems of nonlinear equations that reduces the cost of Jacobian evaluation. Computational experiments lead to the conclusion that the proposed scheme is reliable and efficient and reduces the Jacobian computational cost during the iteration process. Moreover, the proposed scheme compares favorably with classical and existing numerical methods for solving systems of equations.
