A Shamanskii-Like Accelerated Scheme for Nonlinear Systems of Equations

Newton-type methods with diagonal update to the Jacobian matrix are regarded as one most efficient and low memory scheme for system of nonlinear equations. One of the main advantages of these methods is solving nonlinear system of equations having singular Fréchet derivative at the root. In this chapter, we present a Jacobian approximation to the Shamanskii method, to obtain a convergent and accelerated scheme for systems of nonlinear equations. Precisely, we will focus on the efficiency of our proposed method and compare the performance with other existing methods. Numerical examples illustrate the efficiency and the theoretical analysis of the proposed methods.


Introduction
A large aspect of scientific and management problems is often formulated by obtaining the values of x of which the function evaluation of that variable is equal to zero [1]. The above description can be represented mathematically by the following system of nonlinear equations: ⋮⋮⋮ ¼ ⋮ f n x 1 ; x 2 ; …; x n ðÞ ¼ 0 where x 1 ,x 2 , …,x n ∈ R n are vectors and f i is nonlinear functions for i ¼ 1, 2, …,n. The above system of equations (1) can be written as where F : R n ! R n is continuously differentiable in an open neighborhood of the solution x * . These systems are seen as natural description of observed phenomenon of numerous real-life problems whose solutions are seen as an important goal in mathematical study. Recently, this area has been studied extensively [2,3]. The most powerful techniques for handling nonlinear systems of equations are to linearize the equations and proceed to iterate on the linearized set of equations until an accurate solution is obtained [4]. This can be achieved by obtaining the derivative or gradient of the equations. Various scholars stress that the derivatives should be obtained analytically rather than using numerical approach. However, this is usually not always convenient and, in most cases, not even possible as equations may be generated simply by a computer algorithm [2]. For one variable problem, system of nonlinear equation defined in (2) represents a function F : R ! R where f is continuous in the interval f ∈ a; b ½ . Definition 1: Consider a system of equations f 1 ,f 2 , …,f n ; the solution of this system in one variable, two variables, and n variable is referred to as a point a 1 ; a 2 ; …; a n ðÞ ∈ R n such that f 1 a 1 ; a 2 ; …; a n ðÞ ¼ f 2 a 1 ; a 2 ; …; a n ðÞ ¼ … ¼ f n a 1 ; a 2 ; …; a n ðÞ ¼ 0.
In general, the problem to be considered is that for some function fx ðÞ , one wishes to evaluate the derivative at some points x, i.e., Given fx ðÞ , Evaluate; deriv ¼ df dx This often used to represent an instantaneous change of the function at some given points [5]. The Taylor's series expansion of the function fx ðÞabout point x 0 is an ideal starting point for this discussion [1].
Definition 3: Let f be a differentiable function; the Taylor's fx ðÞaround a point a is the infinite sum: However, continuous differentiable vector valued function does not satisfy the mean value theorem (MVT), an essential tool in calculus [6]. Hence, academics suggested the use of the following theorem to replace the mean valued theorem.
Most of the algorithms employ for obtaining the solution of Eq. (1) centered on approximating the Jacobian matrix which often provides a linear map Tx ðÞ: R n ! R n defined by Eq. (3) q . If all the given component functions ðÞ are continuous, then we say the function F is differentiable.
The most famous method for solving nonlinear systems of equations Fx ðÞ¼0is the Newton method which generates a sequence x k fg from any given initial point x 0 via the following: where F 0 x k ðÞ is the Jacobian for Fx k ðÞ . The above sequence Eq. (5) is said to converge quadratically to the solution x * if x 0 is sufficiently near the solution point and the Jacobian F 0 x k ðÞ is nonsingular [7,8]. This convergent rate makes the method outstanding among other numerical methods. However, Jacobian evaluation and solving the linear system for the step sx n ðÞ ¼ À F 0 x n ðÞ À1 Fx n ðÞ are expensive and time-consuming [9]. This led to the study of different variants of Newton methods for systems of nonlinear equations. One of the simplest and low-cost variants of the Newton method that almost entirely evades derivate evaluation at every iteration is the chord method. This scheme computes the Jacobian matrix F 0 x 0 ðÞ once throughout the iteration process for finite dimensional problem as presented in Eq. (6), The rate of convergence is linear and improves as the initial point gets better. Suppose x 0 is sufficiently chosen near solution point x * and Fx * ðÞ is nonsingular; then, for some K c > 0, we have The convergence theorems and proof of Eq. (7) can be referred to [9,10]. Motivated by the excellent convergence of Newton method and low cost of Jacobian evaluation of chord method, a method due originally to Shamanskii [11,12] that lies between Newton method and chord method was proposed and has been analyzed in Kelly [9,[13][14][15]. Other variants of Newton methods with different Jacobian approximation schemes include [9,14,[16][17][18]. However, most of these methods require the computation and storage of the full or approximate Jacobian, which become very difficult and time-consuming as the dimension of systems increases [10,19].
It would be worthwhile to construct a derivative-free approach and analyze with existing techniques [20][21][22]. The aim of this work is to derive a diagonal matrix for the approximate Jacobian of Shamanskii method by means of variational techniques. The expectation would be to reduce computational cost, storage, and CPU time of evaluating any problem. The proposed method works efficiently by combining the good convergence rate of Shamanskii method and the derivate free approach employed, and the results are very encouraging. The next section presents the Shamanskii method for nonlinear systems of equations.

Shamanskii method
It is known that the Newton method defined in Eq. (2) converges quadratically to x * when the initial guess is sufficiently close to the root [7,10,19]. The major concern about this method is the evaluation and storage of the Jacobian matrix at every iteration [23]. A scheme that almost completely overcomes this is the chord method. This method factored the Jacobian matrix only once in the case of a finite dimensional problem, thereby reducing the evaluation cost of each iteration as in Eq. (3) and thereby degrading the convergence rate to linear [10].
Motivated by this, a method due originally to Shamanskii [11] was developed and analyzed by [7,13,14,16,24]. Starting with an initial approximation x 0 , this method uses the multiple pseudo-Newton approach as described below: after little simplification, we have This method converges superlinearly with q-order of at least t þ 1 when the initial approximation x 0 is sufficiently chosen near the solution point x * and F 0 x * ðÞ is nonsingular. This implies that there exists K s > 0, such that Combining Eq. (7) and the quadratic convergence of Newton method produces the convergence rate of the Shamanskii method as in Eq. (8). Thus, the balance is between the reduced evaluation cost of Fréchet derivative and Jacobian computation for Shamanskii method and Newton method rapid convergence. This low-cost derivative evaluation as well as the rapid convergence rate of several methods including the Shamanskii method has been studied and analyzed in [13,15]. From the analysis, the researchers conclude that that Shamanskii method has shown superior performance compared to Newton method in terms of efficiency whenever work is measured in terms of function evaluations [9]. Also, if the value of t is sufficiently chosen, then, as the dimension increases, the performance of the Shamanskii method improves and thus reduces the limit of complexity of factoring the approximate Jacobian for two pseudo-Newton iterations [14]. Please refer to [15] for the proof of the convergence theorem described below.
Theorem 2 [15]: Let F : D ⊂ R n ! R n conform hypotheses H1 2 ðÞ , H2, and H3. Then the solution point x * is a point of attraction of the Shamanskii iteration, i.e., Eq. (10), and this method possesses at least cubic order of convergence.

Diagonal updating scheme for solving nonlinear systems
Evaluation or inversion of the Jacobian matrix at every iteration or after few iterations does not seem relevant even though the computational cost has generally been reduced as in Shamanskii method [14,[25][26][27][28]. As a matter of fact, it can be easily shown that by adding a diagonal updating scheme to a method, we would have a new low memory iterative approach which would approximate the Jacobian F 0 x k ðÞ into nonsingular diagonal matrix that can be updated in every iteration [29][30][31]. Indeed, using the Shamanskii procedure, the proposed method avoids the main complexity of the Newton-type methods by reusing the evaluated Jacobian during the iteration process. This is the basic idea of the Shamanskii-like method which is described as follows.
Given an initial approximation x 0 , we compute Eq. (2) to obtain the Jacobian F 0 x k ðÞ and present a diagonal approximation to the Jacobian say D k as follows: Suppose s k ¼ x kþ1 À x k and y k ¼ Fx kþ1 ðÞ À Fx k ðÞ ; by mean value theorem (MVT), we have D kþ1 s k ≈ y k (13) Substituting Eq. (12) in Eq. (13), we have Since D kþ1 is the update of diagonal matrix D k , let us assume D kþ1 satisfy the weak secant equation: which would be used to minimize the deviation between D kþ1 and D k under some norms. The updated formula for D k follows after the theorem below: Theorem 3: Suppose D kþ1 is the update of the diagonal matrix D k and ∆ k ¼ D kþ1 À D k , s k 6 ¼ 0. Consider the problem such that Eq. (15) holds and : kk F denotes the Frobenius norm. From Eq. (16), we have the following solution also regarded as the optimal solution: where μ is the corresponding Lagrangian multiplier. Simplifying Eq. (18), we have and Also, for diagonal matrix D k , the element of the diagonal component is given as This completes the proof. ∎ Now, from the above description of the theorem, we deduce that the best possible diagonal update D kþ1 is as follows: However, for possibly small s k kk and trΩ k , we need to define a condition that would be applied for such cases. To this end, we require that s k kk ≥ s 1 for some chosen small Thus, the proposed accelerated method is described as follows: The performance of this proposed method would be tested on well-known benchmark problems employed by researchers on existing methods. This would be carried out using a designed computer code for its algorithm. The problems could be artificial or real-life problems. The artificial problems check the performance of any algorithm in situation such as point of singularity, function with many solutions, and null space effect as presented in Figures 1-3 [7,32].
While the real-life problems emerge from fields such as chemistry, engineering, management, etc., the real-life problems often contain large data or complex algebraic expression which makes it difficult to solve.

Numerical results
This section demonstrates the proposed method and illustrates its advantages on some benchmark problems with dimensions ranging from 25 to 1,000 variables.   These include problems with restrictions such as singular Jacobian or problems with only one point of singularity. To evaluate the performance of the proposed diagonal updating Shamanskii method (DUSM), we employ some tools by Dolan and Moré [33] and compare the performance with two classical Newton-type methods based on the number of iterations and CPU time in seconds. The methods include: 1. The Newton method (NM)

The Shamanskii method (SM)
These tools are used to represent the efficiency, robustness, and numerical comparisons of different algorithms. Suppose there exist n s solvers and n p problems; for each problem p and solver s, they define: Requiring a baseline for comparisons, they compared the performance on problem p by solver s with the best performance by any solver for this problem using the performance ratio: We suppose that parameter r m ≥ r p, s for all p, s is chosen and r p, s ¼ r M if and only if solver s does not solve problem p. The performance of solvers s on any given problem might be of interest, but because we would prefer obtaining the overall assessment of the performance of the solver, then it was defined as p s t ðÞ¼ 1 n p size p ∈ P : r p, s ≤ t ÈÉ : Thus, p s t ðÞwas the probability for solver s ∈ S that a performance ratio r p, s was within a factor t ∈ R of the best possible ratio. Then, function p s was the cumulative distribution function for the performance ratio. The performance profile p s : R ! 0; 1 ½ for a solver was nondecreasing, piecewise, and continuous from right. The value of p s 1 ðÞis the probability that the solver will win over the rest of the solvers. In general, a solver with high value of p τ ðÞor at the top right of the figure is preferable or represents the best solver.
All problems considered in this research are solved using MATLAB (R 2015a) subroutine programming [37]. This was run on an Intel® Core™ i5-2410M CPU @ 2.30 GHz processor, 4GB for RAM memory and Windows 7 Professional operating system. The termination condition is set as and the program has been designed to terminate whenever: • The number of iterations exceeds 500, and no point of x k satisfies the termination condition.
• The CPU time in seconds reaches 500.
• Insufficient memory to initiate the run.
At the point of failure due to any of the above conditions as in the tabulated results, it is assumed the number of iteration and CPU time is zero and thus that point has been denoted by " * ." The following are the details of the standard test problems, the initial points used, and the exact solutions for systems of nonlinear equations.
Problem 1 [31]: System of n nonlinear equations x 0 ¼ 1:1; 11:1; …; 1:1 ðÞ Table 1 shows the number of iterations (NI) and CPU time for Newton method (NM), Shamanskii method (SM), and the proposed diagonal updating method (DUSM), respectively. The performance of these methods was analyzed via storage locations and execution time. It can be observed that the proposed DUSM was able to solve the test problems perfectly, while NM and SM fail at some points due to the matrix being singular to working precision. This shows that the diagonal scheme employed has provided an option in the case of singularity, thereby reducing the computational cost of the classical Newton-type methods.

Conclusion
This chapter proposes a diagonal updating formula for systems of nonlinear equations which attributes to reduction in Jacobian evaluation cost. By computational experiments, we reach the conclusion that the proposed scheme is reliable and efficient and reduces Jacobian computational cost during the iteration process. Meanwhile, the proposed scheme is superior compared to the result of the classical and existing numerical methods for solving systems of equations.

Author details
Ibrahim Mohammed Sulaiman*, Mustafa Mamat and Umar Audu Omesa Universiti Sultan Zainal Abidin, Kuala Terengganu, Malaysia *Address all correspondence to: sulaimanib@unisza.edu.my © 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.