Open access peer-reviewed chapter - ONLINE FIRST

Fast Computation of the EM Algorithm for Mixture Models

By Masahiro Kuroda

Reviewed: October 15th, 2021. Published: December 8th, 2021

DOI: 10.5772/intechopen.101249

Abstract

Mixture models have become increasingly popular due to their modeling flexibility and are applied to the clustering and classification of heterogeneous data. The EM algorithm is widely used for the maximum likelihood estimation of mixture models because it is stable in convergence and simple to implement. Despite these advantages, the EM algorithm has two main drawbacks: it converges only locally, and its convergence can be slow. To avoid local convergence, multiple runs from several different initial values are usually used; the algorithm may then take a large number of iterations and a long computation time to find the maximum likelihood estimates. Speeding up the computation of the EM algorithm addresses these problems. We present algorithms that accelerate the convergence of the EM algorithm and apply them to mixture model estimation. Numerical experiments examine the performance of the acceleration algorithms in terms of the number of iterations and computation time.

Keywords

• the EM algorithm
• normal mixture models
• acceleration of convergence
• the vector ε algorithm
• restarting procedure
• initial value selection
• the emEM algorithm

1. Introduction

Mixture models have become increasingly popular due to their modeling flexibility and are applied to the clustering and classification of heterogeneous data; see [1, 2, 3]. The EM algorithm is widely used for the maximum likelihood estimation of mixture models because it is stable in convergence and simple to implement. Despite these advantages, the EM algorithm has two main drawbacks: it converges only locally, and its convergence can be slow.

To circumvent the slow convergence of the EM algorithm, various acceleration algorithms incorporating optimization methods have been proposed. These methods include the multivariate Aitken method, the conjugate gradient method, and the quasi-Newton method [7, 8]. However, they require matrix computations, such as matrix inversion or the evaluation of Hessian and Jacobian matrices, and a line search for step-length optimization. Their acceleration algorithms therefore tend to lack one or more of the nice properties of the EM algorithm, although they may converge faster than the EM algorithm.

As another approach, the ε-accelerated EM algorithm accelerates the convergence of the EM algorithm by using the vector ε (vε) algorithm, a vector extrapolation method [11, 12]. The vε algorithm can accelerate the convergence of the sequence of estimates from the EM algorithm, and therefore the ε-accelerated EM algorithm does not require any modification of the E- and M-steps of the EM algorithm. This is its advantage over other acceleration algorithms based on optimization methods. To reduce the number of iterations and the computation time of the ε-accelerated EM algorithm, the εR-accelerated EM algorithm was developed. It improves the computation speed of the ε-accelerated EM algorithm by embedding a restarting procedure, which finds a value for restarting the EM iterations such that a newly generated EM sequence from that value moves quickly into a neighborhood of a stationary point. We use the ε-accelerated EM and εR-accelerated EM algorithms for parameter estimation.

In the application of the EM algorithm to mixture models, the algorithm is sensitive to the choice of the initial value and may find estimates at a local maximum of the log-likelihood function. Several strategies have been proposed to initiate the EM algorithm efficiently so as to reach the global maximum of the log-likelihood function; see [14, 15, 16, 17]. We use the emEM algorithm for mixture model estimation and improve its computation speed with the ε-accelerated EM and εR-accelerated EM algorithms.

The chapter is organized as follows. Section 2 describes the EM algorithm for normal mixture models. In Section 3, we introduce the ε-accelerated EM and εR-accelerated EM algorithms. Section 4 presents numerical experiments to evaluate the performance of these acceleration algorithms. In Section 5, we provide an acceleration algorithm that applies the ε-accelerated EM and εR-accelerated EM algorithms to the emEM algorithm. Numerical experiments in Section 6 study the effects of these acceleration algorithms on the emEM algorithm. In Section 7, we present our concluding remarks.

2. The EM algorithm for normal mixture models

Let $Y_1, \ldots, Y_n$ be $p$-dimensional random vectors. Assume that an observed data vector $y_i$ of $Y_i$ arises from a mixture distribution of $G$ components with density

$$f(y_i \mid \theta) = \sum_{k=1}^{G} \lambda_k \,\phi(y_i \mid \mu_k, \Sigma_k), \tag{1}$$

where $\phi(y_i \mid \mu_k, \Sigma_k)$ is the $k$-th component density of a $p$-variate normal distribution $N_p(\mu_k, \Sigma_k)$ with mean vector $\mu_k$ and variance–covariance matrix $\Sigma_k$, $\lambda_k$ is the $k$-th mixing proportion such that $0 < \lambda_k < 1$ and $\sum_{k=1}^{G} \lambda_k = 1$, and $\theta = (\lambda_1, \ldots, \lambda_G, \mu_1, \ldots, \mu_G, \mathrm{vec}(\Sigma_1), \ldots, \mathrm{vec}(\Sigma_G))$. Here $\mathrm{vec}(\Sigma_k)$ is the vectorization of $\Sigma_k$. The log-likelihood function of $\theta$ for $y = (y_1, \ldots, y_n)$ is

$$\ell_o(\theta) = \sum_{i=1}^{n} \log f(y_i \mid \theta) = \sum_{i=1}^{n} \log \sum_{k=1}^{G} \lambda_k \,\phi(y_i \mid \mu_k, \Sigma_k). \tag{2}$$

Direct maximization of the function (2) is complicated, and the maximum likelihood estimate (MLE) of $\theta$ is therefore usually found via the EM algorithm.

In the setting of the EM algorithm, we regard $y_i$ as incomplete data and introduce the component-label vector $Z_i = (Z_{i1}, \ldots, Z_{iG})$ of zero–one indicator variables such that $Z_{ik} = 1$ indicates that $y_i$ arises from the $k$-th component of the mixture model and $Z_{ik} = 0$ otherwise. Assume that $Z_i$ has a multinomial distribution $\mathrm{Mu}(1, \lambda)$ with parameter $\lambda = (\lambda_1, \ldots, \lambda_G)$. In the mixture model, the complete data vector is $x_i = (y_i, z_i)$, where $y_i$ is the observed vector and $z_i$ is the unobserved vector of $Z_i$. Then $x_i$ has a mixture distribution with density

$$f(x_i \mid \theta) = \prod_{k=1}^{G} \left[\lambda_k \,\phi(y_i \mid \mu_k, \Sigma_k)\right]^{z_{ik}}. \tag{3}$$

Given $x = (x_1, \ldots, x_n)$, the log-likelihood function of $\theta$ is

$$\ell_c(\theta) = \sum_{i=1}^{n} \sum_{k=1}^{G} z_{ik} \log\left[\lambda_k \,\phi(y_i \mid \mu_k, \Sigma_k)\right], \tag{4}$$

and the MLE $\hat\theta$ of the function (4) is obtained from

$$\hat\lambda_k = \sum_{i=1}^{n} z_{ik} \big/ n, \tag{5}$$
$$\hat\mu_k = \sum_{i=1}^{n} z_{ik}\, y_i \Big/ \sum_{i=1}^{n} z_{ik}, \tag{6}$$
$$\hat\Sigma_k = \sum_{i=1}^{n} z_{ik} (y_i - \hat\mu_k)(y_i - \hat\mu_k)^\top \Big/ \sum_{i=1}^{n} z_{ik} \tag{7}$$

for $k = 1, \ldots, G$. The EM algorithm finds $\hat\theta$ by iterating the expectation step (E-step) and the maximization step (M-step). Let $\theta^{(t)}$ be the $t$-th estimate of $\theta$ in the parameter space $\Theta$. The E-step calculates the $Q$ function, the conditional expectation of $\ell_c(\theta)$ given $y$ and $\theta^{(t)}$:

$$Q(\theta \mid \theta^{(t)}) = E\left[\ell_c(\theta) \mid y, \theta^{(t)}\right]. \tag{8}$$

Mixture models treat $z = (z_1, \ldots, z_n)$ as missing data. The E-step calculates the conditional expectation of $Z_{ik}$ given $y$ and $\theta^{(t)}$:

$$\tau_{ik}^{(t+1)} = E\left[Z_{ik} \mid y, \theta^{(t)}\right] = \Pr\left(Z_{ik} = 1 \mid y, \theta^{(t)}\right) = \lambda_k^{(t)} \,\phi(y_i \mid \mu_k^{(t)}, \Sigma_k^{(t)}) \Big/ \sum_{k'=1}^{G} \lambda_{k'}^{(t)} \,\phi(y_i \mid \mu_{k'}^{(t)}, \Sigma_{k'}^{(t)}). \tag{9}$$

The quantity $\tau_{ik}^{(t)}$ is the posterior probability that $y_i$ belongs to the $k$-th component of the mixture. From Eq. (9), the $Q$ function (8) is given by

$$Q(\theta \mid \theta^{(t)}) = \sum_{i=1}^{n} \sum_{k=1}^{G} \tau_{ik}^{(t+1)} \log\left[\lambda_k \,\phi(y_i \mid \mu_k, \Sigma_k)\right]. \tag{10}$$

The M-step finds $\theta^{(t+1)}$ maximizing the $Q$ function (10) with respect to $\theta$ over $\Theta$ given $\theta^{(t)}$:

$$\theta^{(t+1)} = \mathop{\mathrm{argmax}}_{\theta \in \Theta} Q(\theta \mid \theta^{(t)}). \tag{11}$$

When replacing $z_{ik}$ in Eq. (5) with $\tau_{ik}^{(t+1)}$ from the E-step, we obtain

$$\lambda_k^{(t+1)} = \frac{1}{n} \sum_{i=1}^{n} \tau_{ik}^{(t+1)}. \tag{12}$$

From Eqs. (6) and (7), we also have

$$\mu_k^{(t+1)} = \sum_{i=1}^{n} \tau_{ik}^{(t+1)} y_i \Big/ \sum_{i=1}^{n} \tau_{ik}^{(t+1)}, \tag{13}$$
$$\Sigma_k^{(t+1)} = \sum_{i=1}^{n} \tau_{ik}^{(t+1)} (y_i - \mu_k^{(t+1)})(y_i - \mu_k^{(t+1)})^\top \Big/ \sum_{i=1}^{n} \tau_{ik}^{(t+1)}. \tag{14}$$

We describe the EM algorithm for the normal mixture model in Algorithm 1.

Algorithm 1: The EM algorithm.

E-step: Calculate $\tau_i^{(t+1)} = (\tau_{i1}^{(t+1)}, \ldots, \tau_{iG}^{(t+1)})^\top$ for $i = 1, \ldots, n$ using Eq. (9) and update $\tau^{(t+1)} = (\tau_1^{(t+1)}, \ldots, \tau_n^{(t+1)})$.

M-step: Estimate $\theta^{(t+1)}$ from Eqs. (12)–(14).
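To make the E- and M-steps concrete, the following is a minimal Python sketch of one EM iteration for a univariate ($p = 1$) normal mixture. The chapter's experiments use the $p$-variate model in R, so the function names below are illustrative only, not the author's implementation.

```python
import math
import random

def normal_pdf(y, mu, var):
    # Univariate normal density phi(y | mu, var).
    return math.exp(-(y - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_step(y, lam, mu, var):
    """One EM iteration for a G-component univariate normal mixture."""
    n, G = len(y), len(lam)
    # E-step: posterior probabilities tau[i][k], cf. Eq. (9).
    tau = []
    for yi in y:
        w = [lam[k] * normal_pdf(yi, mu[k], var[k]) for k in range(G)]
        s = sum(w)
        tau.append([wk / s for wk in w])
    # M-step: closed-form updates, cf. Eqs. (12)-(14).
    lam_new, mu_new, var_new = [], [], []
    for k in range(G):
        nk = sum(tau[i][k] for i in range(n))
        mk = sum(tau[i][k] * y[i] for i in range(n)) / nk
        vk = sum(tau[i][k] * (y[i] - mk) ** 2 for i in range(n)) / nk
        lam_new.append(nk / n)
        mu_new.append(mk)
        var_new.append(vk)
    return lam_new, mu_new, var_new

def log_likelihood(y, lam, mu, var):
    # Observed-data log-likelihood, cf. Eq. (2).
    return sum(math.log(sum(l * normal_pdf(yi, m, v)
                            for l, m, v in zip(lam, mu, var))) for yi in y)
```

Iterating `em_step` monotonically increases `log_likelihood`, which is the convergence stability of the EM algorithm referred to above.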

3. Acceleration of the EM algorithm

In order to accelerate the convergence of the EM algorithm, we can use the ε-accelerated EM algorithm and the εR-accelerated EM algorithm. The ε-accelerated EM algorithm incorporates the vector ε (vε) algorithm into the EM algorithm. The εR-accelerated EM algorithm improves the computation speed of the ε-accelerated EM algorithm by adding a restarting procedure.

We briefly introduce the vε algorithm. Let $\{\theta^{(t)}\}_{t \ge 0}$ be a linearly convergent vector sequence from an iterative computational procedure that converges to a stationary point $\hat\theta$ as $t \to \infty$. Then the vε algorithm generates a sequence $\{\psi^{(t)}\}_{t \ge 0}$ that converges to $\hat\theta$ faster than $\{\theta^{(t)}\}_{t \ge 0}$ by using

$$\psi^{(t-1)} = \theta^{(t)} + \left[\left[\Delta\theta^{(t)}\right]^{-1} - \left[\Delta\theta^{(t-1)}\right]^{-1}\right]^{-1}, \tag{15}$$

where $\Delta\theta^{(t)} = \theta^{(t+1)} - \theta^{(t)}$ and $[\theta]^{-1} = \theta/\|\theta\|^2 = \theta/(\theta^\top \theta)$; see Appendix A for details. The algorithm accelerates the convergence of a slowly convergent vector sequence and is very effective for linearly convergent sequences.

We define the EM algorithm as a mapping $\theta \mapsto M(\theta)$ from $\Theta$ to $\Theta$ such that each iteration $\theta^{(t)} \to \theta^{(t+1)}$ is denoted by

$$\theta^{(t+1)} = M(\theta^{(t)}). \tag{16}$$

Algorithm 2: The ε-accelerated EM algorithm.

EM step: Estimate $\theta^{(t+1)}$ from Eq. (16).

ε acceleration step: Calculate $\psi^{(t-1)}$ from $(\theta^{(t+1)}, \theta^{(t)}, \theta^{(t-1)})$ using Eq. (15).

The ε-accelerated EM algorithm is shown in Algorithm 2. Given a convergence criterion δ, the ε-accelerated EM algorithm iterates until

$$\left\|\psi^{(t-1)} - \psi^{(t-2)}\right\|^2 < \delta. \tag{17}$$
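Eq. (15) is straightforward to implement. Below is a small Python sketch of one vε extrapolation step; the names `inv` and `v_epsilon` are illustrative, and `inv` is the Samelson inverse $[\theta]^{-1} = \theta/\|\theta\|^2$ used in Eq. (15).

```python
def inv(v):
    # Samelson inverse: [v]^{-1} = v / ||v||^2.
    s = sum(x * x for x in v)
    return [x / s for x in v]

def v_epsilon(theta_prev, theta, theta_next):
    # Eq. (15): psi^(t-1) = theta^(t) + [ [dtheta^(t)]^{-1} - [dtheta^(t-1)]^{-1} ]^{-1},
    # where dtheta^(t) = theta^(t+1) - theta^(t).
    d_prev = [a - b for a, b in zip(theta, theta_prev)]   # dtheta^(t-1)
    d_curr = [a - b for a, b in zip(theta_next, theta)]   # dtheta^(t)
    diff = [a - b for a, b in zip(inv(d_curr), inv(d_prev))]
    step = inv(diff)
    return [t + s for t, s in zip(theta, step)]
```

For an exactly linearly convergent sequence $\theta^{(t)} = \hat\theta + \rho^t c$ with $|\rho| < 1$, a single vε step recovers $\hat\theta$ exactly, which illustrates why the extrapolation is so effective on the linearly convergent EM sequence.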

Assume that the sequence $\{\theta^{(t)}\}_{t \ge 0}$ from the EM algorithm converges to a stationary point $\hat\theta$. The ε-accelerated EM algorithm generates $\{\psi^{(t)}\}_{t \ge 0}$ converging to $\hat\theta$ faster than $\{\theta^{(t)}\}_{t \ge 0}$ and provides $\hat\theta$ from the final value of $\{\psi^{(t)}\}_{t \ge 0}$ when the algorithm terminates.

Theorems on the convergence and acceleration of the algorithm are given in .

As shown in Algorithm 2, the ε-accelerated EM algorithm generates two parallel sequences: $\{\psi^{(t)}\}_{t \ge 0}$ in the ε acceleration step and $\{\theta^{(t)}\}_{t \ge 0}$ in the EM step. At the ε acceleration step, the EM estimate $M(\psi^{(t-1)})$ from $\psi^{(t-1)}$ may have a larger log-likelihood than the current EM estimate $\theta^{(t+1)}$, that is,

$$\ell_o\left(M(\psi^{(t-1)})\right) > \ell_o\left(\theta^{(t+1)}\right). \tag{18}$$

When this occurs, the EM step is restarted with $M(\psi^{(t-1)})$ as the initial value, and the ε acceleration step obtains $\psi^{(t)}$ from $(\psi^{(t-1)}, M(\psi^{(t-1)}), M(M(\psi^{(t-1)})))$. Notice that at the restarting point, we still generate the EM sequence using three estimates obtained from the same initial value $\psi^{(t-1)}$. In this manner, the ε-acceleration is always applied to a sequence obtained by the EM mapping $M$ from the same initial value.

In our experiments, the restarting procedure is performed almost every time when only the restarting condition $\ell_o(M(\psi^{(t-1)})) > \ell_o(\theta^{(t+1)})$ is used, and it then takes much computation time inefficiently. As an additional condition for restarting the EM step, we require $\|\psi^{(t-1)} - \psi^{(t-2)}\|^2 < \delta_{Re}$, where $\delta_{Re} > \delta$, and reset $\delta_{Re} = \delta_{Re}/10^{k}$ at each restart, where $k$ is an integer, such as one. By adding this condition, we can control the restarting frequency. For example, set $\delta = 10^{-12}$, and initialize $\delta_{Re} = 1$ and $k = 1$. Then the restarting procedure is performed at most 12 times.

The restarting conditions are summarized as follows:

(i) $\ell_o\left(M(\psi^{(t-1)})\right) > \ell_o\left(\theta^{(t+1)}\right)$, and

(ii) $\left\|\psi^{(t-1)} - \psi^{(t-2)}\right\|^2 < \delta_{Re}$.

Condition (i) means that the log-likelihood function can be increased by restarting. Condition (ii) is used to reduce the frequency of restarting. This is the key idea of the restarting procedure. The εR-accelerated EM algorithm is the ε-accelerated EM algorithm with the restarting procedure using conditions (i) and (ii) and is given in Algorithm 3.

Algorithm 3: The εR-accelerated EM algorithm.

EM step: Estimate $\theta^{(t+1)}$ from Eq. (16).

ε acceleration step: Calculate $\psi^{(t-1)}$ from $(\theta^{(t+1)}, \theta^{(t)}, \theta^{(t-1)})$ using Eq. (15).

Restarting step: If $\ell_o(M(\psi^{(t-1)})) > \ell_o(\theta^{(t+1)})$ and $\|\psi^{(t-1)} - \psi^{(t-2)}\|^2 < \delta_{Re}$, then set

$$\theta^{(t)} = \psi^{(t-1)}, \tag{19}$$

update

$$\theta^{(t+1)} = M(\psi^{(t-1)}), \tag{20}$$

and reset

$$\delta_{Re} = \delta_{Re}/10^{k}. \tag{21}$$
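The restarting step can be sketched as a single decision function in Python; `M` (the EM mapping of Eq. (16)) and `loglik` ($\ell_o$) are passed as callables, and all names here are illustrative assumptions rather than the author's implementation.

```python
def restart_check(M, loglik, theta_t, theta_t1, psi_prev, psi_prev2, delta_re, k=1):
    # Conditions (i) and (ii) of the restarting step in Algorithm 3.
    m_psi = M(psi_prev)
    cond_i = loglik(m_psi) > loglik(theta_t1)                       # condition (i), Eq. (18)
    dist2 = sum((a - b) ** 2 for a, b in zip(psi_prev, psi_prev2))
    cond_ii = dist2 < delta_re                                      # condition (ii)
    if cond_i and cond_ii:
        # Restart: Eqs. (19)-(21).
        return psi_prev, m_psi, delta_re / 10 ** k, True
    return theta_t, theta_t1, delta_re, False
```

Returning the tightened $\delta_{Re}$ realizes the cap on restart frequency described above (with $\delta_{Re} = 1$, $k = 1$, and $\delta = 10^{-12}$, at most 12 restarts).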

The εR-accelerated EM algorithm also gives $\hat\theta$ from the final value of $\{\psi^{(t)}\}_{t \ge 0}$. When the restarting step effectively finds values for restarting the EM step, the εR-accelerated EM algorithm greatly reduces the number of iterations and the computation time for convergence. Its advantage over the ε-accelerated EM algorithm is that it restarts the EM step at a better current estimate and also ensures that the log-likelihood function increases over the iterations.

Theoretical results on the convergence and the speed of convergence of the εR-accelerated EM algorithm are given in .

4. Numerical experiments for the acceleration of the EM algorithm

We investigate how much faster the ε-accelerated EM and εR-accelerated EM algorithms converge than the EM algorithm. All computations are performed with the statistical package R running on Windows on an Intel Core i5 3.00 GHz machine with 8 GB of memory.

The R package MixSim [17, 20] is used to simulate a random data matrix $y$ having a $p$-variate normal mixture distribution of $G$ components. We generate $y = (y_1, \ldots, y_{1000})$ and find the MLE of $\theta$ using the EM, ε-accelerated EM, and εR-accelerated EM algorithms. The procedure is replicated 100 times. Here, we consider $p = 2, 3, 4, 5, 6$ and $G = 4$. For all experiments, we set $\delta = 10^{-12}$ for convergence of the algorithms, and $\delta_{Re} = 1$ and $k = 1$ for the restarting condition of the εR-accelerated EM algorithm. Initial values of the algorithms are obtained from the k-means method using the R function kmeans.

Tables 1 and 2 report the number of iterations and the CPU time of these algorithms for each $p$. The CPU times (in seconds) are measured by the R function proc.time, whose times are typically available to 10 milliseconds. For all computations, the acceleration algorithms found the same MLEs as the EM algorithm. The tables show that the EM algorithm requires a large number of iterations for convergence, whereas the two acceleration algorithms converge in a smaller number of iterations. The εR-accelerated EM algorithm, in particular, greatly reduces both the number of iterations and the CPU time.

| | | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|---|
| p=2 | EM | 172.00 | 467.25 | 771.00 | 1069.48 | 1302.25 | 10852.00 |
| | ε | 133.00 | 308.50 | 445.00 | 697.74 | 706.50 | 8090.00 |
| | εR | 83.00 | 182.50 | 253.50 | 424.22 | 396.50 | 4967.00 |
| p=3 | EM | 210.00 | 403.50 | 628.50 | 716.33 | 946.75 | 1973.00 |
| | ε | 121.00 | 276.75 | 400.50 | 484.83 | 604.75 | 1566.00 |
| | εR | 68.00 | 167.50 | 244.50 | 307.99 | 359.75 | 1183.00 |
| p=4 | EM | 166.00 | 372.75 | 468.50 | 618.63 | 755.75 | 2193.00 |
| | ε | 120.00 | 248.75 | 331.50 | 400.00 | 461.50 | 1452.00 |
| | εR | 58.00 | 139.00 | 194.50 | 241.25 | 291.25 | 884.00 |
| p=5 | EM | 141.00 | 334.75 | 492.50 | 879.35 | 783.00 | 24886.00 |
| | ε | 101.00 | 235.50 | 351.00 | 687.31 | 516.00 | 24756.00 |
| | εR | 57.00 | 144.00 | 226.00 | 431.55 | 336.50 | 14288.00 |
| p=6 | EM | 193.00 | 361.25 | 499.00 | 655.80 | 647.75 | 5910.00 |
| | ε | 144.00 | 252.00 | 323.50 | 454.45 | 473.75 | 5825.00 |
| | εR | 99.00 | 163.75 | 230.50 | 302.13 | 299.00 | 4771.00 |

Table 1.

Summary statistics of the number of iterations of the EM, ε-accelerated EM (ε), and εR-accelerated EM (εR) algorithms for 100 simulated random data sets. Each data set is generated from a p-variate normal mixture distribution of four components.

| | | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|---|
| p=2 | EM | 0.39 | 1.04 | 1.68 | 2.31 | 2.80 | 22.73 |
| | ε | 0.30 | 0.75 | 1.08 | 1.66 | 1.66 | 19.18 |
| | εR | 0.22 | 0.49 | 0.66 | 1.11 | 1.04 | 13.21 |
| p=3 | EM | 0.75 | 1.40 | 2.07 | 2.64 | 3.30 | 8.53 |
| | ε | 0.45 | 1.01 | 1.46 | 1.99 | 2.52 | 7.60 |
| | εR | 0.35 | 0.68 | 1.00 | 1.44 | 1.68 | 8.26 |
| p=4 | EM | 0.42 | 0.93 | 1.16 | 1.53 | 1.86 | 5.34 |
| | ε | 0.28 | 0.65 | 0.86 | 1.06 | 1.24 | 3.80 |
| | εR | 0.20 | 0.44 | 0.59 | 0.71 | 0.86 | 2.39 |
| p=5 | EM | 0.25 | 0.64 | 0.92 | 1.65 | 1.50 | 46.11 |
| | ε | 0.22 | 0.49 | 0.72 | 1.42 | 1.08 | 50.36 |
| | εR | 0.16 | 0.35 | 0.51 | 0.95 | 0.80 | 29.07 |
| p=6 | EM | 0.51 | 1.02 | 1.42 | 1.84 | 1.88 | 17.86 |
| | ε | 0.43 | 0.75 | 1.02 | 1.37 | 1.47 | 17.75 |
| | εR | 0.32 | 0.54 | 0.76 | 0.99 | 1.00 | 14.29 |

Table 2.

Summary statistics of CPU time (in seconds) of the EM, ε-accelerated EM (ε), and εR-accelerated EM (εR) algorithms for 100 random data sets. Each data set is generated from a p-variate normal mixture distribution of four components.

To measure the speed of convergence of the EM and the two acceleration algorithms, we calculate iteration and CPU time speedups. The iteration speedup of an acceleration algorithm over the EM algorithm is defined by

$$\frac{\text{the number of iterations of the EM algorithm}}{\text{the number of iterations of the acceleration algorithm}}.$$

The CPU time speedup is calculated similarly. Tables 3 and 4 show the iteration and CPU time speedups of the two acceleration algorithms. Comparing the mean values, the ε-accelerated EM algorithm is about 1.5 times and 1.4 times faster than the EM algorithm in the number of iterations and CPU time, respectively, and the εR-accelerated EM algorithm is more than twice as fast as the EM algorithm in both. The boxplots of Figures 1 and 2 also show that the εR-accelerated EM algorithm is clearly much faster than the ε-accelerated EM algorithm. Table 3 and Figure 1 indicate that in 75 out of 100 replications, the εR-accelerated EM algorithm requires less than half the iterations of the EM algorithm. For CPU time (Table 4 and Figure 2), the εR-accelerated EM algorithm is more than twice as fast as the EM algorithm in 50 out of 100 replications.

| | | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|---|
| p=2 | ε | 1.05 | 1.34 | 1.54 | 1.61 | 1.77 | 3.58 |
| | εR | 1.15 | 2.08 | 2.73 | 3.03 | 3.48 | 11.36 |
| p=3 | ε | 1.07 | 1.32 | 1.52 | 1.52 | 1.68 | 2.15 |
| | εR | 1.20 | 1.97 | 2.57 | 2.58 | 2.98 | 6.08 |
| p=4 | ε | 1.13 | 1.32 | 1.48 | 1.51 | 1.62 | 2.33 |
| | εR | 1.45 | 2.09 | 2.42 | 2.60 | 2.94 | 9.04 |
| p=5 | ε | 1.01 | 1.30 | 1.46 | 1.47 | 1.63 | 2.06 |
| | εR | 1.33 | 1.84 | 2.23 | 2.32 | 2.67 | 4.32 |
| p=6 | ε | 1.01 | 1.28 | 1.46 | 1.49 | 1.65 | 2.33 |
| | εR | 1.24 | 1.86 | 2.17 | 2.37 | 2.59 | 6.75 |

Table 3.

Summary statistics of the iteration speedup of the ε-accelerated EM (ε) and εR-accelerated EM (εR) algorithms for 100 random data sets. Each data set is generated from a p-variate normal mixture distribution of four components.

| | | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|---|
| p=2 | ε | 0.97 | 1.22 | 1.45 | 1.47 | 1.67 | 3.37 |
| | εR | 1.05 | 1.71 | 2.24 | 2.50 | 2.85 | 8.60 |
| p=3 | ε | 0.85 | 1.21 | 1.39 | 1.40 | 1.56 | 2.07 |
| | εR | 0.78 | 1.61 | 2.04 | 2.08 | 2.40 | 4.48 |
| p=4 | ε | 1.02 | 1.27 | 1.39 | 1.43 | 1.53 | 2.11 |
| | εR | 1.20 | 1.70 | 2.03 | 2.17 | 2.43 | 7.48 |
| p=5 | ε | 0.92 | 1.17 | 1.33 | 1.34 | 1.50 | 2.06 |
| | εR | 1.12 | 1.48 | 1.76 | 1.86 | 2.12 | 3.21 |
| p=6 | ε | 0.84 | 1.18 | 1.39 | 1.39 | 1.55 | 2.21 |
| | εR | 1.00 | 1.57 | 1.77 | 1.98 | 2.24 | 5.47 |

Table 4.

Summary statistics of the CPU time speedup of the ε-accelerated EM (ε) and εR-accelerated EM (εR) algorithms for 100 random data sets. Each data set is generated from a p-variate normal mixture distribution of four components.

Figure 1. Boxplots of the iteration speedup of the ε-accelerated EM (ε) and εR-accelerated EM (εR) algorithms for 100 random data sets generated from a p-variate normal mixture distribution of four components.

Figure 2. Boxplots of the CPU time speedup of the ε-accelerated EM (ε) and εR-accelerated EM (εR) algorithms for 100 random data sets. Each data set is generated from a p-variate normal mixture distribution of four components.

Figure 3 shows the boxplots of the iteration and CPU time speedups of the εR-accelerated EM algorithm for $p = 6$. Here, "more" ("less") means that the number of iterations of the EM algorithm is more (less) than the median in Tables 1 and 2. We can see from the figure that, for larger numbers of EM iterations ("more"), the εR-accelerated EM algorithm works well to speed up the convergence of $\{\psi^{(t)}\}_{t \ge 0}$. We observed similar results for the other $p$. Therefore, the algorithm is more powerful when the EM algorithm takes a larger number of iterations.

Figure 3. Boxplots of the iteration and CPU time speedups of the εR-accelerated EM algorithm for 100 random data sets. Each data set is generated from a six-variate normal mixture distribution of four components. The label "less" ("more") means that the number of iterations of the EM algorithm is less (more) than the median in Tables 1 and 2.

The results from the tables and figures demonstrate that the restarting step of the εR-accelerated EM algorithm enables a significant increase in computation speed at little additional computational cost.

5. Initial value selection for normal mixture models

It is well known that the log-likelihood function (2) may have numerous local maxima. The EM algorithm does not guarantee the global maximum of the log-likelihood function due to its local convergence. Thus, the performance of the EM algorithm deeply depends on the initial value of $\theta$. Several methods for selecting the initial value have been proposed; for example, see [14, 15, 16, 17]. These methods are based on multiple runs of the EM algorithm from different initial values and find $\hat\theta$ attaining the global maximum of the log-likelihood function.

We apply the emEM algorithm to mixture model estimation. The algorithm is popular and usually provides excellent results when the number of components is not large. The emEM algorithm selects an initial value in the em step, which runs several short EM iterations from different initial values under a lax convergence criterion, and then obtains $\hat\theta$ from the EM step, which runs the EM algorithm from the selected initial value under a strict convergence criterion.

The em step consists of three steps. The first step generates $J$ initial values of $\theta$. The second step runs the EM algorithm from these initial values with a lax convergence criterion; hence, we do not wait for the EM algorithm to converge and stop the iterations early. The third step selects the value giving the largest log-likelihood among the $J$ trials.

Let $\delta_{ini}$ be a convergence criterion and $T_{max}$ the maximum number of iterations. We present the emEM algorithm in Algorithm 4.

Algorithm 4: The emEM algorithm.

em step: Select $\theta^{(0)}$ for the EM step.

Random initialization step: Draw $J$ initial values $\theta^{(0,j)}$, $j = 1, \ldots, J$.

Short running step: Repeat the following computation for $j = 1, \ldots, J$: generate $\{\theta^{(t_j,\, j)}\}_{t_j \ge 0}$ by iterating the EM algorithm from $\theta^{(0,j)}$, and stop the iterations at the $t_j$-th iteration if

$$\frac{\ell_o(\theta^{(t_j,\, j)}) - \ell_o(\theta^{(t_j - 1,\, j)})}{\ell_o(\theta^{(t_j,\, j)}) - \ell_o(\theta^{(0,\, j)})} < \delta_{ini}, \quad \text{or} \quad t_j > T_{max}. \tag{22}$$

Obtain $\theta^{(j)} = \theta^{(t_j,\, j)}$.

Selection step: From the $J$ candidate initial values $\theta^{(j)}$, $j = 1, \ldots, J$, find

$$\theta^{(0)} = \mathop{\mathrm{argmax}}_{\theta^{(j)},\; j = 1, \ldots, J} \ell_o(\theta^{(j)}). \tag{23}$$

EM step: Given $\theta^{(0)}$ from the em step, find $\hat\theta$ using the EM algorithm.

The em step performs multiple runs of the EM algorithm, and its computation may therefore be time-consuming. We replace the EM algorithm with the ε-accelerated EM algorithm in the em step and use the εR-accelerated EM algorithm to obtain $\hat\theta$ in the EM step. By applying these acceleration algorithms to the emEM algorithm, it is possible to reduce the number of iterations and the CPU time. This accelerated version of the emEM algorithm is referred to as the εem-εREM algorithm and is shown in Algorithm 5.

Algorithm 5: the εem-εREM algorithm.

ε-em step: Select $\theta^{(0)}$ for the εR-EM step.

Random initialization step: Draw $J$ initial values $\theta^{(0,j)}$, $j = 1, \ldots, J$.

Short running step: Repeat the following computation for $j = 1, \ldots, J$: generate $\{\psi^{(t_j,\, j)}\}_{t_j \ge 0}$ by iterating the ε-accelerated EM algorithm from $\theta^{(0,j)}$, and stop the iterations at the $t_j$-th iteration if

$$\frac{\ell_o(\psi^{(t_j,\, j)}) - \ell_o(\psi^{(t_j - 1,\, j)})}{\ell_o(\psi^{(t_j,\, j)}) - \ell_o(\psi^{(0,\, j)})} < \delta_{ini}, \quad \text{or} \quad t_j > T_{max}. \tag{24}$$

Obtain $\theta^{(j)} = \psi^{(t_j,\, j)}$.

Selection step: From the $J$ candidate initial values $\theta^{(j)}$, $j = 1, \ldots, J$, find

$$\theta^{(0)} = \mathop{\mathrm{argmax}}_{\theta^{(j)},\; j = 1, \ldots, J} \ell_o(\theta^{(j)}). \tag{25}$$

εR-EM step: Given $\theta^{(0)}$ from the ε-em step, find $\hat\theta$ using the εR-accelerated EM algorithm.

6. Numerical experiments for the initial value selection

We evaluate the performance of the ε-accelerated EM and εR-accelerated EM algorithms in application to the emEM algorithm.

By using MixSim, we simulate $y = (y_1, \ldots, y_{1000})$ having a $p$-variate normal mixture distribution of six components for $p = 2, 3, 4, 5, 6$. The values of $\delta$, $\delta_{Re}$, and $k$ are the same as in the experiments of Section 4. Assume, to be safe, that the probability of not finding the global maximum of the log-likelihood function in a single run is 0.80. Then the probability of finding the global maximum at least once in 50 runs is $1 - 0.80^{50} > 0.9999$. In the em and ε-em steps, we draw 50 initial values $\theta^{(0,j)}$, $j = 1, \ldots, 50$, from kmeans and set $\delta_{ini} = 0.001$ and $T_{max} = 1000$.
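The arithmetic behind this choice of 50 runs can be checked directly; the failure probability 0.80 is the safety assumption stated above.

```python
p_miss = 0.80   # assumed probability that a single run misses the global maximum
J = 50          # number of random initial values
p_hit_at_least_once = 1.0 - p_miss ** J
assert p_hit_at_least_once > 0.9999   # the bound claimed in the text
```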

Tables 5 and 6 present the number of iterations and CPU time for each $p$. Table 5 shows that the number of iterations of the ε-em step is much smaller than that of the em step; the ε-accelerated EM algorithm effectively improves the computation speed of the em step. Comparing the number of iterations and CPU time of the εem-εREM algorithm with those of the emEM algorithm, the totals for the εem-εREM algorithm are roughly half or less of those for the emEM algorithm. The results illustrate that the ε-accelerated EM and εR-accelerated EM algorithms can substantially accelerate the convergence of the emEM algorithm.

| | em | EM | emEM total | ε-em | εR-EM | εem-εREM total |
|---|---|---|---|---|---|---|
| p=2 | 1912 | 3834 | 5746 | 1415 | 1429 | 2844 |
| p=3 | 1995 | 1490 | 3485 | 925 | 354 | 1279 |
| p=4 | 2352 | 725 | 3077 | 997 | 451 | 1448 |
| p=5 | 3344 | 885 | 4229 | 1516 | 397 | 1913 |
| p=6 | 2641 | 957 | 3598 | 1234 | 435 | 1669 |

Table 5.

The numbers of iterations of the emEM and εem-εREM algorithms. The em and ε-em steps generate 50 random initial values.

| | em | EM | emEM total | ε-em | εR-EM | εem-εREM total |
|---|---|---|---|---|---|---|
| p=2 | 6.04 | 7.37 | 13.41 | 4.67 | 3.22 | 7.89 |
| p=3 | 6.36 | 3.14 | 9.50 | 3.23 | 1.00 | 4.23 |
| p=4 | 8.81 | 1.61 | 10.42 | 3.98 | 1.86 | 5.84 |
| p=5 | 12.55 | 2.33 | 14.88 | 6.04 | 1.19 | 7.23 |
| p=6 | 11.01 | 2.44 | 13.45 | 5.35 | 1.43 | 6.78 |

Table 6.

CPU times of the emEM and εem-εREM algorithms. The em and ε-em steps generate 50 random initial values.

7. Concluding remarks

In this chapter, we introduced the ε-accelerated EM and εR-accelerated EM algorithms. Both algorithms are given by very simple computational procedures and require only a small amount of extra computation per iteration, while accelerating the convergence of the EM algorithm well.

When the EM algorithm is applied to normal mixture models, it may converge slowly and depend heavily on the initial value. The first problem is solved by accelerating the EM algorithm; the numerical experiments demonstrated the effectiveness of the ε-accelerated EM and εR-accelerated EM algorithms. For the second problem, initial value selection is useful to initiate the EM algorithm. We applied the emEM algorithm to normal mixture model estimation and developed the εem-εREM algorithm to speed up the computation of the emEM algorithm: the ε-accelerated EM algorithm is used in the em step, and the εR-accelerated EM algorithm in the EM step. Numerical experiments showed that the εem-εREM algorithm converges in a smaller number of iterations and shorter CPU time than the emEM algorithm.

The ε-accelerated EM and εR-accelerated EM algorithms accelerate the convergence of the EM algorithm without any modification of its E- and M-steps. This means that these algorithms do not require deriving an acceleration formula for every statistical model. Thus, they can be applied to several mixture models: mixtures of factor analyzers, mixtures of multivariate t-distributions, mixtures of generalized hyperbolic distributions, and parsimonious Gaussian mixture models. We expect that the convergence of the EM algorithms used in these mixture models tends to be slow. The results from the experiments show that the ε-accelerated EM and εR-accelerated EM algorithms are useful due to their fast speed of convergence and ease of use.

Appendix A

Let $\theta^{(t)}$ denote a $d$-dimensional vector that converges to a vector $\hat\theta$ as $t \to \infty$. We define $[\theta]^{-1} = \theta/\|\theta\|^2 = \theta/(\theta^\top \theta)$. In general, the vε algorithm for a sequence $\{\theta^{(t)}\}_{t \ge 0}$ starts with

$$\varepsilon_t^{(-1)} = 0, \qquad \varepsilon_t^{(0)} = \theta^{(t)}, \tag{26}$$

and then generates a vector $\varepsilon_t^{(k+1)}$ by

$$\varepsilon_t^{(k+1)} = \varepsilon_{t+1}^{(k-1)} + \left[\varepsilon_{t+1}^{(k)} - \varepsilon_t^{(k)}\right]^{-1} = \varepsilon_{t+1}^{(k-1)} + \left[\Delta\varepsilon_t^{(k)}\right]^{-1}, \qquad k = 0, 1, 2, \ldots. \tag{27}$$

For practical implementation, we apply the vε algorithm with $k = 1$ to accelerate the convergence of $\{\theta^{(t)}\}_{t \ge 0}$. From the above equation, we have

$$\varepsilon_t^{(2)} = \varepsilon_{t+1}^{(0)} + \left[\Delta\varepsilon_t^{(1)}\right]^{-1} \quad \text{for } k = 1, \tag{28}$$
$$\varepsilon_t^{(1)} = \varepsilon_{t+1}^{(-1)} + \left[\Delta\varepsilon_t^{(0)}\right]^{-1} = \left[\Delta\varepsilon_t^{(0)}\right]^{-1} \quad \text{for } k = 0. \tag{29}$$

Then the vector $\varepsilon_t^{(2)}$ becomes

$$\varepsilon_t^{(2)} = \varepsilon_{t+1}^{(0)} + \left[\left[\Delta\varepsilon_{t+1}^{(0)}\right]^{-1} - \left[\Delta\varepsilon_t^{(0)}\right]^{-1}\right]^{-1} = \theta^{(t+1)} + \left[\left[\Delta\theta^{(t+1)}\right]^{-1} - \left[\Delta\theta^{(t)}\right]^{-1}\right]^{-1}. \tag{30}$$

When setting $\psi^{(t)} = \varepsilon_t^{(2)}$, we obtain Eq. (15).
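The ε table of Eqs. (26) and (27) can also be computed directly up to $k = 2$; the following Python sketch (with illustrative names) builds $\varepsilon_t^{(2)}$ and hence reproduces the extrapolation of Eq. (30).

```python
def samelson_inv(v):
    # [v]^{-1} = v / ||v||^2.
    s = sum(x * x for x in v)
    return [x / s for x in v]

def epsilon_table(theta_seq, t):
    # Build eps_t^(k) of Eqs. (26)-(27) from theta_seq[t], theta_seq[t+1],
    # theta_seq[t+2], and return eps_t^(2), i.e. Eq. (30).
    d = len(theta_seq[0])
    eps = {}
    for s in range(t, t + 3):
        eps[(s, -1)] = [0.0] * d        # Eq. (26), first column
        eps[(s, 0)] = theta_seq[s]      # Eq. (26), second column
    for k in (0, 1):
        for s in range(t, t + 2 - k):
            delta = [a - b for a, b in zip(eps[(s + 1, k)], eps[(s, k)])]
            eps[(s, k + 1)] = [a + b for a, b in zip(eps[(s + 1, k - 1)],
                                                     samelson_inv(delta))]  # Eq. (27)
    return eps[(t, 2)]
```

For a geometrically convergent sequence $\theta^{(t)} = \hat\theta + \rho^t c$, the returned $\varepsilon_t^{(2)}$ equals $\hat\theta$ exactly, matching the behavior of Eq. (15).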


© 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference


Masahiro Kuroda (December 8th, 2021). Fast Computation of the EM Algorithm for Mixture Models [Online First]. IntechOpen. DOI: 10.5772/intechopen.101249.