Open access peer-reviewed chapter

# Unconstrained Optimization Methods: Conjugate Gradient Methods and Trust-Region Methods

Written By

Snezana S. Djordjevic

Reviewed: January 14th, 2019 Published: February 19th, 2019

DOI: 10.5772/intechopen.84374

From the Edited Volume

## Applied Mathematics

Edited by Bruno Carpentieri

Chapter metrics overview

View Full Metrics

## Abstract

Here, we consider two important classes of unconstrained optimization methods: conjugate gradient methods and trust region methods. These two classes of methods are very interesting; it seems that they are never out of date. First, we consider conjugate gradient methods. We also illustrate the practical behavior of some conjugate gradient methods. Then, we study trust region methods. Considering these two classes of methods, we analyze some recent results.

### Keywords

• trust region methods

## 1. Introduction

Remind to the unconstrained optimization problem which we can present as

minxRnfx, E1

where f:RnRis a smooth function.

Here, we consider two classes of unconstrained optimization methods: conjugate gradient methods and trust region methods. Both of them are made with the aim to solve the unconstrained optimization problem (1).

In this chapter, at first, we consider the conjugate gradient methods. Then, we study trust region methods. Also, we try to give some of the most recent results in these areas.

## 2. Conjugate gradient method (shortly CG)

The conjugate gradient method is the method between the steepest descent method and the Newton method.

The conjugate gradient method in fact deflects the direction of the steepest descent method by adding to it a positive multiple of the direction used in the last step.

The restarting and the preconditioning are very important to improve the conjugate gradient method [47].

Some of well-known CGmethods are [12, 19, 20, 23, 24, 31, 39, 40, 49]:

βkHS=gk+1TykdkTyk
βkFR=gk+12gk2
βkPRP=gk+1Tykgk2
βkCD=gk+12dkTgk
βkLS=gk+1TykdkTgk
βkDY=gk+12dkTyk
βkN=yk2dkyk2dkTykTgk+1dkTyk
βkWYL=gkTgkgkgk1gk1gk12

fx=12xTGx+bTx+c, E2

where Gis an n×nsymmetric positive definite matrix, bRn, and cis a real number.

Theorem 1.2.1.[47] (Property theorem of conjugate gradient method) For positive definite quadratic function(2), FR conjugate gradient method with the exact line search terminates aftermnsteps, and the following properties hold for alli,0im:

diTGdj=0,j=0,1,,i1;
giTgj=0,j=0,1,,i1;
diTgi=giTgi;
g0g1gi=g0Gg0Gig0;

d0d1di=g0Gg0Gig0,

where mis the number of distinct eigenvalues of G.

Now, we give the algorithm of conjugate gradient method.

Algorithm 1.2.1. (CG method).

Assumptions:ε<0and x0Rn. Letk=0, t0=0, d1=0, d0=g0, β1=0, and β0=0.

Step 1. Ifgkε, then STOP.

Step 2. Calculate the step-sizetkby a line search.

Step 3. Calculateβkby any of the conjugate gradient method.

Step 4. Calculatedk=gk+βk1dk1.

Step 5. Setxk+1=xk+tkdk.

Step 6. Setk=k+1and go to Step 1.

### 2.1 Convergence of conjugate gradient methods

Theorem 1.2.2.[47] (Global convergence of FR conjugate gradient method) Suppose thatf:RnRis continuously differentiable on a bounded level set

L=xRnfxfx0,

and let FR method be implemented by the exact line search. Then, the produced sequence xkhas at least one accumulation point, which is a stationary point, i.e.:

1. Whenxkis a finite sequence, then the final pointxis a stationary point off.

2. Whenxkis an infinite sequence, then it has a limit point, and it is a stationary point.

In [35], a comparison of two methods, the steepest descent method and the conjugate gradient method which are used for solving systems of linear equations, is illustrated. The aim of the research is to analyze, which method is faster in solving these equations and how many iterations are needed by each method for solving.

The system of linear equations in the general form is considered:

Ax=B, E3

where matrix Ais symmetric and positive definite.

The conclusion is that the SDmethod is a faster method than the CG, because it solves equations in less amount of time.

By the other side, the authors find that the CGmethod is slower but more productive than the SD, because it converges after less iterations.

So, we can see that one method can be used when we want to find solution very fast and another can converge to maximum in less number of iterations.

Again, we consider the problem (1), where f:RnRis a smooth function and its gradient is available.

A hybrid conjugate gradient method is a certain combination of different conjugate gradient methods; it is made to improve the behavior of these methods and to avoid the jamming phenomenon.

An excellent survey of hybrid conjugate gradient methods is given in [5].

Three-term conjugate gradient methods were studied in the past (e.g., see [8, 32, 34], etc.); but, from recent papers about CGmethods, we can conclude that maybe the mainstream is made by three-term and even four-term conjugate gradient methods. An interesting paper about a five-term hybrid conjugate gradient method is [1]. Also, from recent papers we can conclude that different modifications of the existing CGmethods are made, as well as different hybridizations of CGand BFGSmethods.

Consider unconstrained optimization problem (1), where f:RnRis a continuously differentiable function, bounded from below. Starting from an initial point x0Rn, the three-term conjugate gradient method with line search generates a sequence xk, given by the next iterative scheme:

xk+1=xk+tkdk, E4

where tkis a step-size which is obtained from the line search, and

d0=g0,dk+1=gk+1+δksk+ηkyk.

In the last relation, δkand ηkare the conjugate gradient parameters, sk=xk+1xk, gk=fxk, and yk=gk+1gk. We can see that the search direction dk+1is computed as a linear combination of gk+1, sk, and yk.

In [6], the author suggests another way to get three-term conjugate gradient algorithms by minimization of the one-parameter quadratic model of the function f. The idea is to consider the quadratic approximation of the function fin the current point and to determine the search direction by minimization of this quadratic model. It is assumed that the symmetrical approximation of the Hessian matrix Bk+1satisfies the general quasi-Newton equation which depends on a positive parameter:

Bk+1sk=ω1yk,ω0. E5

In this paper the quadratic approximation of the function fis considered:

Φk+1d=fk+1+gk+1Td+12dTBk+1d.

The direction dk+1is computed as

dk+1=gk+1+βksk, E6

where the scalar βkis determined as the solution of the following minimizing problem:

minβkRΦk+1dk+1. E7

From (6) and (7), the author obtains

βk=gk+1TBk+1skgk+1TskskTBk+1sk. E8

Using (5), from (7), the next expression for βkis obtained:

βk=gk+1Tykωgk+1TskykTsk. E9

Using the idea of Perry [36], the author obtains

dk+1=gk+1+ykTgk+1ωskTgk+1ykTskskskTgk+1ykTskyk.

In fact, in this approach the author gets a family of three-term conjugate gradient algorithms depending of a positive parameter ω.

Next, in [52], the WYLconjugate gradient (CG) formula, with βkWYL0, is further studied. A three-term WYLCGalgorithm is presented, which has the sufficiently descent property without any conditions. The global convergence and the linear convergence are proven; moreover, the n-step quadratic convergence with a restart strategy is established if the initial step length is appropriately chosen.

The first three-term Hestenes-Stiefel (HS) method (TTHSmethod) can be found in [55].

Baluch et al. [7] describe a modified three-term Hestenes-Stiefel (HS) method. Although the earliest conjugate gradient method HSachieves global convergence using an exact line search, this is not guaranteed in the case of an inexact line search. In addition, the HSmethod does not usually satisfy the descent property. The modified three-term conjugate gradient method from [7] possesses a sufficient descent property regardless of the type of line search and guarantees global convergence using the inexact Wolfe-Powell line search [50, 51]. The authors also prove the global convergence of this method. The search direction, which is considered in [7], has the next form:

where βkBZA=gkTgkgk1dk1Tyk1+μgkTdk1, θkBZA=gkTdk1dk1Tyk1+μgkTdk1,μ>1.

In [13], an accelerated three-term conjugate gradient method is proposed, in which the search direction satisfies the sufficient descent condition as well as extended Dai-Liao conjugacy condition:

dkTyk1=tgkTsk1,t0.

This method seems different from the existent methods.

Next, Li-Fushikuma quasi-Newton equation is

2fxksk1=zk1, E10

where

zk1=yk1+Cgk1rsk1+maxsk1Tyk1sk120sk1,

where Cand rare two given positive constants. Based on (10), Zhou and Zhang [56] propose a modified version of DLmethod, called ZZmethod in [13].

In [30], some new conjugate gradient methods are extended, and then some three-term conjugate gradient methods are constructed. Namely, the authors remind to [41, 42], with its conjugate gradient parameters, respectively:

βkRMIL=gkTyk1dk12, E11
βkMRMIL=gkTgkgk1dk1dk12, E12

wherefrom it is obvious that βkMRMIL=βkRMILfor the exact line search. Let us say that these methods, presented in [41, 42], are RMILand MRMILmethods.

The three-term RMILand MRMILmethods are introduced in [30].

The search direction dkcan be expressed as

d0=g0,dk=gk+βkdk1+θkyk1,

where βkis given by (11) or (12), and

θk=gkTdk1dk12.

An important property of the proposed methods is that the search direction always satisfies the sufficient descent condition without any line search, that is, the next relation always holds

gkTdkgk2.

Under the standard Wolfe line search and the classical assumptions, the global convergence properties of the proposed methods are proven.

Having in view the conjugate gradient parameter suggested in [49], in [45] the next two conjugate gradient parameters are presented:

βkMHS=gk2gkgk1gkTgk1dk1Tgkgk1, E13
βkMLS=gk2gkgk1gkTgk1dk1Tgk1. E14

Motivated by [49], as well as by [45], in [1], a new hybrid nonlinear CGmethod is proposed; it combines the features of five different CGmethods, with the aim of combining the positive features of different non-hybrid methods. The proposed method generates descent directions independently of the line search. Under some assumptions on the objective function, the global convergence is proven under the standard Wolfe line search. Conjugate gradient parameter, proposed in [1], is

βkhAO=gk2max0gkgk1gkTgk1maxgk12dk1Tgkgk1dk1Tgk1. E15

Let’s note that the proposed method is hybrid of FR, DY, WYL, MHS, and MLS.

The behaviors of the methods BZA, TTRMIL, MRMIL, MHS, MLS, and hAOare illustrated by the next tables.

The test criterion is CPU time.

The tests are performed on the computer Workstation Intel Celeron CPU 1,9 GHz.

The experiments are made on the test functions from [3].

Each problem is tested for a number of variables n=1000and n=5000.

The average CPU time values are given in the last rows of these tables (Tables 1, 2, 3, 4).

functionBZATTRMILMRMILMHSMLShAO
Ext.Pen.21.79334020.96653416.03690319.81212721.93374120.326930
Raydan16.8016447.0668456.3492417.0980457.0668457.332047
Raydan20.6084040.5928040.5772040.5928040.6084040.639604
Diag.10.6084040.6084040.5772040.6084040.5148030.577204
Diag.25.1636335.6004364.6956304.7580315.6628364.851631
Diag.35.6160365.6940375.2416345.7564375.5848365.506835
Gen.Tridiag.-13.0420192.9328192.6832172.9484192.7924182.808018
Hager2.9172192.9328192.6208173.0420192.9172192.886019
Ext.Tridiag.-12.8860192.9328192.7612182.9328192.7300182.917219
Ext.ThreeExp.2.9796192.9640192.6052172.8860193.0420192.714417
Diag.42.9016192.8704182.5740162.7924182.9484192.652017
Diag.52.7924182.9172192.5740162.9016193.0264192.901619
Ext.Himm.2.7612182.7144172.6676172.9640192.9952192.854818
Ext.PSC12.9328192.7456182.7144172.5116163.0264192.792418
FullHess.FH22.8704182.9484192.8860192.8392183.0108192.948419
Ext.Bl.Diag.BD12.9796192.8860192.9484192.8860192.9016192.542816
Ext.EP12.7300182.4024152.9328192.6988172.7924182.652017
Ext.Tridiag.-22.6832172.6052172.8392182.8704182.8860192.542816
Tridia2.6832172.5116162.9640192.8236182.8236182.511616
Dqdrtic2.7612182.9952192.9016192.8236182.7300182.589617
Quartc(Cute)2.8860192.7768182.8860192.7768182.8704182.839218
Dixon3dq(Cute)2.8080182.9484192.9484192.8392182.9172192.605217

### Table 1.

n= 1000.

functionBZATTRMILMRMILMHSMLShAO
Biggsb1(Cute)2.7924182.8704182.8704182.9172192.9796192.901619
Gen.quart.2.9172192.9328192.4648162.9484192.8080182.620817
Diag.72.5740162.5896172.8704182.6208173.0264192.698817
Diag.82.7300182.9796192.8392182.9640192.7924182.979619
Full Hess.FH32.9484192.5740162.6988173.0264192.6364172.745618
Himmelbg2.8548183.0108192.9016192.8548182.9952192.730018
Ext.Pow.2.9016192.8548182.7612182.8080182.8704182.995219
Ext.Maratos2.8548182.9484192.8704182.9952192.8704182.917219
Ext.Cliff2.9640193.0420192.8548182.9328192.8860192.854818
Ext.Wood2.9952192.9328192.9484192.9484192.9640192.948419
Ext.Trigon.2.7924182.9952192.8392183.0108192.9952192.745618
Ext.Rosenbr.2.9640192.8392182.9484192.9328192.9952192.776818
Average3.9156253.9281053.5334233.8680453.9733453.722184

### Table 2.

n = 1000.

functionBZATTRMILMRMILMHSMLShAO
Ext.Pen.46.16069646.83150048.65671266.28482565.86362263.695208
Raydan112.99488312.10567813.75928816.97290916.59850616.754507
Raydan21.1700081.0296071.0764071.1544071.0920071.107607
Diag.18.8452570.9048061.0764071.1232071.1700081.092007
Diag.28.6580557.8312507.9248519.09485810.35846610.327266
Diag.38.3616549.1416598.67365610.68606810.35846610.514467
Gen.Tridiag.-15.6160365.3820345.8656386.0216396.4896426.364841
Hager5.2416344.8516315.8812386.2868405.3040346.021639
Ext.Tridiag.-15.0076324.8048315.7408375.7876376.2244405.803237
Ext.ThreeExp.4.8828314.8204315.5224356.1152396.3336415.834437
Diag.44.9296324.8984315.1792335.8032376.1776406.427241
Diag.55.6940374.8516315.5380365.7096375.8968386.115239
Ext.Himm.5.8344375.1168335.3820346.0996395.7720376.411641
Ext.PSC15.0232325.0544325.1636336.4116416.1152395.990438
FullHess.FH25.2104334.9296324.8516316.0684396.3492416.349241
Ext.Bl.Diag.BD14.8516315.0076325.2260336.3648416.3648415.569236
Ext.EP15.0700325.0388326.0528396.1152394.9920326.177640
Ext.Tridiag.-24.8516314.9764324.8516316.3492415.9904386.099639
Tridia5.4132354.8204315.4756355.5692365.8188376.021639
Dqdrtic5.1636334.9452325.0232325.4288356.0060385.850038
Quartc(Cute)5.9124385.3508345.8344375.7876375.8968386.193240
Dixon3dq(Cute)5.4288354.7892315.1636336.1620395.6160365.881238

### Table 3.

n = 5000.

functionBZATTRMILMRMILMHSMLShAO
Biggsb1(Cute)5.1480334.6956305.4132355.9124386.0528396.349241
Gen.quart.5.2884344.7580315.0232326.3492416.0528394.960832
Diag.75.1636334.6644305.0544325.9592386.1932406.255640
Diag.85.7876374.7424304.8984316.0996395.6004366.208840
Full Hess.FH35.4444354.7892315.5692366.1776406.1620396.224440
Himmelbg5.5848366.1308395.4756355.4756356.0060385.912438
Ext.Pow.5.5692364.7892314.7736315.9904385.7720376.162039
Ext.Maratos5.1480335.7408374.9764326.0216396.2868406.130839
Ext.Cliff5.9436385.8500384.9764325.9904385.3040346.286840
Ext.Wood5.5848365.6472364.7892316.2556405.3508346.021639
Ext.Trigon.5.3664345.7096374.7736316.1152396.0216395.787637
Ext.Rosenbr.6.1776405.3196344.6176306.3336416.0216396.021639
Average7.793029957.3273677.6947499.2875195259.2067899.225899

### Table 4.

n = 5000.

In [2], based on the numerical efficiency of Hestenes-Stiefel (HS) method, a new modified HSalgorithm is proposed for unconstrained optimization. The new direction independent of the line search satisfies the sufficient descent condition. Motivated by theoretical and numerical features of three-term conjugate gradient (CG) methods proposed by [33], similar to the approach in [10], the new direction is computed by minimizing the distance between the CGdirection and the direction of the three-term CGmethods proposed by [33]. Under some mild conditions, the global convergence of the new method for general functions is established when the standard Wolfe line search is used. In this paper the conjugate gradient parameter is given by

βk=βkHSθk, E16

where

θk=1gkTdk12gk2dk12.

But this new CGdirection does not fulfill a descent condition, so further modification is made, namely, having in view [53], the authors [2] introduce

β¯k=βkλyk1θkdk1Tyk12gkTdk1,

where λ>14is a parameter. Also, the global convergence is proven under standard conditions.

It is worth to mention the next papers about this theme, which can be interesting [4, 14, 15, 16, 17, 25, 26, 27].

## 3. Trust region methods

We remind that the basic idea of Newton method is to approximate the objective function fxaround xkby using a quadratic model:

qks=fxk+gkTs+12skTGks,

where gk=fxk, Gk=2fxk, and also use the minimizer skof qksto set xk+1=xk+sk.

Also, remind that Newton method can only guarantee the local convergence, i.e., when sis small enough and the method is convergent locally.

Further, Newton method cannot be used when Hessian is not positive definite.

There exists another class of methods, known as trust region methods. It does not use the line search to get the global convergence, as well as it avoids the difficulty which is the consequence of the nonpositive definite Hessian in the line search.

Furthermore, it produces greater reduction of the function fthan line search approaches.

Here, we define the region around the current iterate:

Ωk=x:xxkΔk,

where Δkis the radius of Ωk, inside which the model is trusted to be adequate to the objective function.

Our further intention is to choose a step which should be the approximate minimizer of the quadratic model in the trust region. In fact, xk+skshould be the approximately best point on the sphere:

xk+ssΔk,

with the center xkand the radius Δk.

In the case that this step is not acceptable, we reduce the size of the step, and then we find a new minimizer.

This method has the rapid local convergence rate, and that’s the property of Newton method and quasi-Newton method, too, but the important characteristic of trust region method is also the global convergence.

Since the step is restricted by the trust region, this method is also called the restricted step method.

The model subproblem of the trust region method is

minqks=fxk+gkTs+12sTBks, E17
s.t.sΔk, E18

where Δkis the trust region radius and Bkis a symmetric approximation of the Hessian Gk.

In the case that we use the standard l2norm 2, skis the minimizer of qksin the ball of radius Δk. Generally, different norms define the different shapes of the trust region.

Setting Bk=Gkin (17)(18), the method becomes a Newton-type trust region method.

The problem by itself is the choice of Δkat each single iteration.

If the agreement between the model qksand the objective function fxk+sis satisfactory enough, the value Δkshould be chosen as large as it is possible. The expression Aredk=fxkfxk+skis called the actual reduction, and the expression Predk=qk0qkskis called the predicted reduction; here, we emphasize that

rk=AredkPredk

measures the agreement between the model function qksand the objective function fxk+s.

If rkis close to 0 or it is negative, the trust region is going to shrink; otherwise, we do not change the trust region.

The conclusion is that rkis important in making the choice of new iterate xk+1as well as in updating the trust region radius Δk. Now, we give the trust region algorithm.

Algorithm 1.3.1.(Trust region method).

Assumptions: x0, Δ¯, Δ00Δ¯, ε0, 0<η1η2<1, and 0<γ1<1<γ2.

Letk=0.

Step 1. Ifgkε, then STOP.

Step 2. Approximately solve the problem(17)(18) forsk.

Step 3. Computefxk+skandrk. Set

xk+1=xk+sk,ifrkη1,xk,otherwise.

Step 4. Ifrk<η1, thenΔk+10γ1Δk.

Ifrkη1η2, thenΔk+1γ1ΔkΔk.

Ifrkη2andsk=Δk, thenΔk+1Δkminγ2ΔkΔ¯.

Step 5. GenerateBk+1, updateqk, setk=k+1, and go to Step 1.

In Algorithm 1.3.1, Δ¯is a bound for all Δk. Those iterations with the property rkη2(and so those for which Δk+1Δk) are called very successful iterations; the iterations with the property rkη1(and so those for which xk+1=xk+sk) are called successful iterations; and the iterations with the property rk<η1(and so those for which xk+1=xk) are called unsuccessful iterations. Generally, the iterations from the two first cases are called successful iterations.

Some choices of parameters are η1=0,01, η2=0,75, γ1=0,5, γ2=2, Δ0=1, and Δ0=110g0. The algorithm is insensitive to change of these parameters.

Next, if rk<0,01, then Δk+1can be chosen from 0.01,0.5skon the basis of a polynomial interpolation.

In the case of quadratic interpolation, we set

Δk+1=λsk,

where

λ=gkTsk2fxk+skfxkgkTsk.

### 3.1 Convergence of trust region methods

Assumption 1.3.1(AssumptionA0).

We assume that the approximations of HessianBkare uniformly bounded in norm and the level setL=xfxfx0is bounded, as well asf:RnRis continuously differentiable onL. We allow the length of the approximate solutionskof the subproblem(17)(18) to exceed the bound of the trust region, but we also assume that

skη˜Δk,

whereη˜is a positive constant.

In this kind of trust region way of thinking, generally we do not seek an accurate solution of the subproblem (17)(18); we are satisfied by finding a nearly optimal solution of the subproblem (17)(18).

Strong theoretical as well as numerical results can be obtained if the step sk, produced by Algorithm 1.3.1, satisfies

qk0qkskβ1gk2minΔkgk2Bk2,β101.

Theorem 1.3.1[47] Under AssumptionA0, if Algorithm 3.1 has finitely many successful iterations, then it converges to the first-order stationary point.

Theorem 1.3.2[47] Under AssumptionA0, if Algorithm 3.1 has infinitely many successful iterations, then

liminfkgk=0.

In [44], it is emphasized that trust region methods are very effective for optimization problems and a new adaptive trust region method is presented. This method combines a modified secant equation with the BFGSupdate formula and an adaptive trust region radius, where the new trust region radius makes use of not only the function information but also the gradient information. Let B̂kbe a positively definite matrix based on modified Cholesky factorization [43]. Under suitable conditions, in [44] the global convergence is proven; also, the local superlinear convergence of the proposed method is demonstrated. Motivated by the adaptive technique, the proposed method possesses the following nice properties:

1. The trust region radius uses not only the gradient value but also the function value.

2. Computing the matrix B̂kof the inverse and the value of B̂k1, at each iterative point xk, is not required.

3. The computational time is reduced.

A modified secant equation is introduced:

Bk+1dk=qk, E19

where qk=yk+hkdk, fk=fxk, and hk=gk+1+gkTdk+2fkfk+1dk2.

When fis twice continuously differentiable and Bk+1is generated by the BFGSformula, where B0=I, this modified secant Eq. (19) possesses the following nice property:

fk=fk+1gk+1Tdk+12dkTBk+1dk,

and this property holds for all k.

Under classical assumptions, the global convergence of the method presented in [44] is also proven in this paper.

In [28], the hybridization of monotone and non-monotone approaches is made; a modified trust region ratio is used, in which more information is provided about the agreement between the exact and the approximate models. An adaptive trust region radius is used, as well as two accelerated Armijo-type line search strategies to avoid resolving the trust region subproblem whenever a trial step is rejected. It is shown that the proposed algorithm is globally and locally superlinearly convergent. In this paper trust region methods are denoted shortly by TR; it is emphasized that in TRmethod, having in view that the iterative scheme is

x0Rn,xk+1=xk+sk,k=0,1,,

and it often happens that skis an approximate solution of the following quadratic subproblem:

minsRn,skΔkmks=gkTs+12skTBks. E20

Performance of the TRmethods is much influenced by the strategy of choosing the TRradius at each iteration. To determine the radius Δk, in the standard TRmethod, the agreement between fxk+sand mksis evaluated by the so-called TRratio ρk:

ρk=fxkfxk+skmk0mksk.

When ρkis negative or a small positive number near to zero, the quadratic model is a poor approximation of the objective function. In such situation, Δkshould be decreased and, consequently, the subproblem (20) should be solved again. However, when ρkis close to 1, it is reasonable to use the quadratic model as an approximation of the objective function. So, the step skshould be accepted and Δkcan be increased. Here, the authors use the modified version of ρk:

ρ¯k=Rkfxk+skPkmksk,

where Rk=ηkflk+1ηkfk, ηkηminηmax, ηmin01, and ηmaxηmin1. Also,

flk=max0jqkfkj,fi=fxi,q0=0,0qkminqk1+1N,

where NNwhich is originally used by Toint [48].

Something more about trust region methods can be found in [9, 18, 21, 22, 54].

## 4. Conclusion

The conjugate gradient methods and trust region methods are very popular now.

Many scientists consider these methods.

Namely, different modifications of these methods are made, with the aim to improve them.

Next, the scientists try to make not only new methods but also whole new classes of methods. For the specific values of the parameters, individual methods are distinguished from these classes. It is always more desirable to make a class of methods instead of individual methods.

Hybrid conjugate gradient methods are made in many different ways; this class of conjugate gradient methods is always actual.

Further, one of the contemporary trends is to use BFGSupdate in constructions of new conjugate gradient methods (e.g., see [46]).

Finally, let us emphasize that contemporary papers often use the Picard-Mann-Ishikawa iterative processes and they make the connection of these kinds of processes with the unconstrained optimization (see [29, 37, 38]).

## References

1. 1. Adeleke OJ, Osinuga IA. A five-term hybrid conjugate gradient method with global convergence and descent properties for unconstrained optimization problems. Asian Journal of Scientific Research. 2018;11:185-194
2. 2. Amini K, Faramarzi P, Pirfalah N. A modified Hestenes-Stiefel conjugate gradient method with an optimal property. Optimization Methods and Software. In press. 2018. DOI: 10.1080/10556788.2018.1457150
3. 3. Andrei N. An unconstrained optimization test functions. Advanced Modeling and Optimization. An Electronic International Journal. 2008;10:147-161
4. 4. Andrei N. Acceleration of conjugate gradient algorithms for unconstrained optimization. Applied Mathematics and Computation. 2009;213:361-369
5. 5. Andrei N. 40 Conjugate Gradient Algorithms For Unconstrained Optimization. A Survey on Their Definition. ICI Technical Report No. 13/08, March 14, 2008
6. 6. Andrei N. A new three-term conjugate gradient algorithm for unconstrained optimization. Numerical Algorithms. 2015;68:305-321
7. 7. Baluch B, Salleh Z, Alhawarat A. A new modified three-term Hestenes-Stiefel conjugate gradient method with sufficient descent property and its global convergence. Hindawi Journal of Optimization. 2017;2017:1-13
8. 8. Beale EML. A derivative of conjugate gradients. In: Lootsma FA, editor. Numerical Methods for Nonlinear Optimization. London: Academic; 1972. pp. 39-43
9. 9. Curtis FE, Lubberts Z, Robinson DP. Concise complexity analyses for trust-region methods. Optimization Letters. 2018;12(8):1713-1724
10. 10. Dai YH, Kou CX. A nonlinear conjugate gradient algorithm with an optimal property and an improved Wolfe line search. SIAM Journal on Optimization. 2013;23(1):296-320
11. 11. Dai Y, Yuan Y. Alternate minimization gradient method. IMA Journal of Numerical Analysis. 2003;23:377-393
12. 12. Dai YH, Yuan Y. A nonlinear conjugate gradient method with a strong global convergence property. SIAM Journal on Optimization. 1999;10:177-182
13. 13. Dong X-L, Han D, Dai Z, Li L, Zhu J. An accelerated three-term conjugate gradient method with sufficient descent condition and conjugacy condition. Journal of Optimization Theory and Applications. 2018;179:944-961
14. 14. Du S, Chen M. A new smoothing modified three-term conjugate gradient method forl1-norm minimization problem. Journal of Inequalities and Applications. 2018;2018(1):1-14. SpringerOpen
15. 15. Djordjević SS. New hybrid conjugate gradient method as a convex combination of FR and PRP methods. Univerzitet u Nišu. 2016;30(11):3083-3100
16. 16. Djordjević SS. New hybrid conjugate gradient method as a convex combination of LS and CD methods. Univerzitet u Nišu. 2016;31(6):1813-1825
17. 17. Djordjevic S. New hybrid conjugate gradient method as a convex combination of Ls and Fr methods. Acta Mathematica Scientia. 2019;39(1):214-228
18. 18. El-Sobky B, Abo-Elnaga Y. A penalty method with trust-region mechanism for nonlinear bilevel optimization problem. Journal of Computational and Applied Mathematics. 2018;340:360-374
19. 19. Fletcher R, Reeves C. Function minimization by conjugate gradients. The Computer Journal. 1964;7:149-154
20. 20. Fletcher R. Practical methods of optimization. In: Unconstrained Optimization. Vol. 1. New York: John Wiley & Sons; 1987
21. 21. Gao J, Cao J. A class of derivative-free trust-region methods with interior backtracking technique for nonlinear optimization problems subject to linear inequality constraints. Journal of Inequalities and Applications. 2018;2018(1):108;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5942389/
22. 22. Gertz EM, Gill PE. A primal-dual trust region algorithm for nonlinear optimization. Mathematical Programming, Series B. 2004;100:49-94
23. 23. Hager WW, Zhang H. A new conjugate gradient method with guaranteed descent and an efficient line search. SIAM Journal on Optimization. 2003;16(1):170-192
24. 24. Hestenes MR, Stiefel EL. Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards. 1952;49:409-436
25. 25. Huang H, Lin S. A modified Wei-Yao-Liu conjugate gradient method for unconstrained optimization. Applied Mathematics and Computation. 2014;231:179-186
26. 26. Huang H et al. The proof of the sufficient descent condition of the Wei-Yao-Liu conjugate gradient method under the strong Wolfe-Powell line search. Applied Mathematics and Computation. 2007;189(2):1241-1245
27. 27. Jiang Z, Xu N. Hot spot thermal floor plan solver using conjugate gradient to speed up. Mobile Information Systems. 2018;2018:1-8
28. 28. Babaie-Kafaki S, Rezaee S. Two accelerated nonmonotone adaptive trust region line search methods. Numerical Algorithms. 2018;78:911-928
29. 29. Khan SH. A Picard-Mann hybrid iterative process. Fixed Point Theory and Applications. 2013;2013:69. DOI: 10.1186/1687-1812-2013-69
30. 30. Liu JK, Feng YM, Zou LM. Some three-term conjugate gradient methods with the inexact line search condition. Calcolo. 2018;55:16
31. 31. Liu Y, Storey C. Efficient generalized conjugate gradient algorithms, Part 1: Theory. Journal of Optimization Theory and Applications. 1991;69:129-137
32. 32. McGuire MF, Wolfe P. Evaluating a Restart Procedure for Conjugate Gradients, Report RC-4382. Yorktown Heights: IBM Research Center; 1973
33. 33. Narushima Y, Yabe H, Ford JA. A three-term conjugate gradient method with sufficient descent property for unconstrained optimization. SIAM Journal on Optimization. 2011;21:212-230
34. 34. Nazareth L. A conjugate direction algorithm without line search. Journal of Optimization Theory and Applications. 1977;23:373-387
35. 35. Osadcha O, Marszaek Z. Comparison of steepest descent method and conjugate gradient method. In: CEUR Workshop Proceedings, SYSTEM 2017 - Proceedings of the Symposium for Young Scientists in Technology, Engineering and Mathematics. 2017;1853:1-4
36. 36. Perry A. Technical Note - A modified conjugate gradient algorithm. Operations Research. 1978;26(6):1073-1078
37. 37. Petrovic M, Rakocevic V, Kontrec N, et al. Hybridization of accelerated gradient descent method. Numerical Algorithms. 2018;79:769-786. DOI: 10.1007/s11075-017-0460-4
38. 38. Petrović MJ, Stanimirović PS, Kontrec N, Mladenov J. Hybrid modification of accelerated double direction method. Mathematical Problems in Engineering. 2018;2018:1-8
39. 39. Polak E, Ribiére G. Note sur la convergence de de directions conjugées. In: Rev. Francaise Informat Recherche Opertionelle, 3e Année. Vol. 16. 1969. pp. 35-43
40. 40. Polyak BT. The conjugate gradient method in extreme problems. USSR Computational Mathematics and Mathematical Physics. 1969;9:94-112
41. 41. Rivaie M, Mamat M, June LW, Mohd I. A new class of nonlinear conjugate gradient coefficient with global convergence properties. Applied Mathematics and Computation. 2012;218:11323-11332
42. 42. Rivaie M, Mamat M, Abashar A. A new class of nonlinear conjugate gradient coefficients with exact and inexact line searches. Applied Mathematics and Computation. 2015;268:1152-1163
43. 43. Schnabel RB, Eskow E. A new modified Cholesky factorization. SIAM Journal on Scientific Computing. 1990;11:1136-1158
44. 44. Sheng Z, Yuan G, CUI Z. A new adaptive trust region algorithm for optimization problems. Acta Mathematica Scientia. 2018;38B(2):479-496
45. 45. Shengwei Y, Wei Z, Huang H. A note about Wyl’s conjugate gradient method and its applications. Applied Mathematics and Computation. 2007;191:381-388
46. 46. Stanimirović PS, Ivanov B, Djordjević S, Brajević I. New hybrid conjugate gradient and Broyden–Fletcher–Goldfarb–Shanno Conjugate Gradient Methods. Journal of Optimization Theory and Applications. 2018;178:860-884
47. 47. Sun W, Yuan Y-X. Optimization theory and methods: Nonlinear programming. Springer Optimization and Its Applications. 2006;1
48. 48. Toint PhL. Nonmonotone trust region algorithms for nonlinear optimization subject to convex constraints. Mathematical Programming. 1997;77:69. DOI: 10.1007/BF02614518
49. 49. Wei Z, Yao S, Liu L. The convergence properties of some new conjugate gradient methods. Applied Mathematics and Computation. 2006;183(2):1341-1350
50. 50. Wolfe P. Convergence conditions for ascent methods. SIAM Review. 1969;11:226-235
51. 51. Wolfe P. Convergence conditions for ascent methods II: Some corrections. SIAM Review. 1969;11:226-235
52. 52. Wu G, Li Y, Yuan G. A three-term conjugate gradient algorithm with quadratic convergence for unconstrained optimization problems. Hindawi, Mathematical Problems in Engineering. 2018, Article ID: 4813030, 15 p. DOI: 10.1155/2018/4813030
53. 53. Yu GH, Guan LT, Chen WF. Spectral conjugate gradient methods with sufficient descent property for large scale unconstrained optimization. Optimization Methods and Software. 2008;23(2):275-293
54. 54. Zhang X, Zhang J, Liao L. An adaptive trust region method and its convergence. Science in China, Series A: Mathematics. 2002;45A:620-631
55. 55. Zhang L, Zhou W, Li D. Some descent three-term conjugate gradient methods and their global convergence. Optimization Methods and Software. 2007;22(4):697-711
56. 56. Zhou W, Zhang L. A nonlinear conjugate gradient method based on theMBFGSsecant condition. Optimization Methods and Software. 2006;21(5):707-714

Written By

Snezana S. Djordjevic

Reviewed: January 14th, 2019 Published: February 19th, 2019