Computer simulations for a single harmonic frequency model

## 1. Introduction

The ultimate goal of collecting data is to gain meaningful information about a physical system. However, in many situations, the quantities that we would like to determine are different from the ones which we are able to have measured. If the data we measured depends on the quantities we want, it contains at least some information about them. Therefore, our general interest is to subtract this information from data.

Let the vector $$ contain the parameters to be estimated from the (measurements) vector $$ which is the output of the physical system that one wants to be modeled. The physical system is described by a vector function $$ in the form:

where * t* represents time. In many experiments, the recorded data $$ are sampled from an unknown function$$ together with errors $$ at discrete times$$:

The measurement errors $$ are generally assumed to be drawn independently from a zero mean Gaussian probability distribution with a standard deviation of$$. On the other hand, different signal models correspond to different choices of signal model function$$. In this chapter, we restrict our attention to the static

Static refers to that the amplitudes of the sinusoids do not change with time.

sinusoidal model given bywhere$$’s represent the amplitudes of the signal model and

The goal of data analysis is usually to use the observed data $$ to infer the values of parameters$$. Besides estimating the values of the parameters, there are two additional important problems. The one is to obtain an indication of the uncertainties associated with each parameter, i.e. some measures of how far they are away from the true parameters. The other we will not consider here is to assess whether or not the model is appropriate for explaining the data.

Sinusoidal parameter estimation in additive white noise within a Bayesian framework has been an important problem in signal processing and still now is an active research area because of its wide variety of applications in multiple disciplines such as sonar, radar, digital communications and biomedical engineering. The purpose of this research is therefore to develop accurate and computationally efficient estimators for sinusoidal parameter, namely, amplitudes and frequencies. In above problem, one may or may not know the number of sinusoids. When it is unknown, it is called model selection (Andrieu and Doucet, 1999; Üstündag, 2011) and is not subject to this chapter. Under an assumption of a known number of sinusoids, several algorithms have already been used in the parameter estimation literature, such as least-square fitting (Press et al., 1995), discrete Fourier transform (Cooley & Tukey, 1964), and periodogram (Schuster, 1905). With least square fitting, the model is completely defined and the question remaining is to find the values of the parameters by minimizing the sum of squared residuals. The discrete Fourier transform has been a very powerful tool in Bayesian spectral analysis since Cooley and Tukey introduced the fast Fourier transform (FFT) technique in 1965, followed by the rapid development of computers. In 1987, Jaynes derived periodogram directly from the principles of Bayesian inference. After his work, researchers in different branches of science have given much attention to the relationship between Bayesian inference and parameter estimation and they have done excellent works in this area for last fifteen years (Bretthorst, 1990; Üstündag et al., 1989, 1991; Harney, 2003; Gregory, 2005; Üstündag & Cevri, 2008, 2011; Üstündag, 2011).

In this chapter, we studied Bayesian recovery of sinusoids using estimation approach proposed by Bretthorst for a general signal model equation and combined it with a simulated annealing (SA) algorithm to obtain a global maximum of the posterior probability density function (PDF)$$ for frequencies. Unfortunately, conventional algorithms (Press et al., 1995) based on the gradient direction fail to converge for this problem. Even when they converge, there is no assurance that they have found the global, rather than a local maximum. This is because the logarithm of the PDF $$ is so sharply peaked and highly nonlinear function of frequencies. In this respect, a pattern search algorithm described by Hook-Jevees (Hooke & Jevees, 1962) to overcome this problem has already been used by some researchers in literature of estimation. However, we have found out that this approach does not converge unless the starting point is much closer to the optimum so that we have developed an algorithm in which this Bayesian approach is combined with a simulated annealing (SA) algorithm (Kirkpatrick, et al., 1983; Corana et al., 1987; Goffe et al., 1994; Ingber, 1996), which is a function optimization strategy based on an analogy with the creation of crystals from their melts. This explores the entire surface of the posterior PDF for the frequencies and tries to maximize it while moving both uphill and downhill steps, whose sizes are controlled by a parameter $$ that plays the role of the temperature in a physical system. By slowly lowering the temperature $$ towards zero according to a properly chosen schedule, one can show that the globally optimal solutions are approached asymptotically. Thus, it is largely independent of the starting values, often a critical input in conventional algorithms, and also offers different approach to finding parameter values of sinusoids through a directed, but random, search of the parameter space. In this context, an algorithm of this Bayesian approach is developed and coded in Mathematica programming language (Wellin at al., 2005) and also tested for recovering noisy sinusoids with multiple frequencies. Furthermore, simulation studies on synthetic data sets of a single sinusoid under a variety of signal to noise ratio (SNR) are made for a comparison of its performance with Cramér-Rao lower bound (CRLB), known as a lower limit on variance of any unbiased estimator. The simulations results support its effectiveness.

## 2. Bayesian parameter estimation

Let us now reconsider above problem within a Bayesian context (Bernardo & Smith, 2000; Bretthorst, 1988; Gregory, 2005; Harney, 2003; Jaynes, 2003; Ruanaidh & Fitzgerald, 1996; Üstündag & Cevri, 2008). As with all Bayesian calculations, the first step is to write down Bayes’ theorem for the joint PDF for all of the unknown parameters$$:

The quantity $$ is called the prior PDF; it represents prior knowledge of the parameters $$given the information$$. The sampling distribution $$ is the likelihood (Edwards, 1972) of the data $$ given the model parameters. The probability function $$ is the evidence, which is a constant for parameter estimation and is used here for normalizing the posterior PDF$$. Therefore, it can be dropped in Equation (5) so that the joint PDF for the unknown parameters turns out to be the following form:

A key component in Bayes theorem is the likelihood function $$which is proportional to the probability density of the noise. If its standard deviation $$ is assumed to be known, then the likelihood function takes on the following form:

where the exponent $$ is defined as follows

This is equivalent to

where

In order to obtain the posterior PDF for$$, Equation (6) can be integrated with respect to the nuisance parameters $$ under the knowledge of$$:

With the choices of an uniform prior PDF or independent Gaussians distributions with mean zero and known standard deviation for the amplitudes, the integral equations in (12) turn out to be a Gaussian integral which can be evaluated analytically (Bretthorst, 1988). To do this it is simply to convert the square matrix $$ into a special type of matrix- a so called diagonal matrix- that shares the same fundamental properties of the underlying matrix. In other words, it is equivalent to transforming the underlying system of equations into a special set of coordinate axes. Therefore, this diagonalization process (Bretthorst, 1988) effectively introduces new orthonormal model functions,

and also gives a new expression for the signal model function in Equation (3):

The new amplitudes$$ 's are related to the old amplitudes$$ 's by

where $$ represents the $$th component of the $$th normalized eigenvector of$$, with $$ as the corresponding eigenvalue. Substituting these expressions into Equation (9) and defining

to be the projection of the data onto the orthonormal model functions$$, we can then proceed to perform the $$ integration over$$in Equation (12) to obtain

with

This represents the mean-square of the observed projections. If $$ is known, the joint posterior probability density of the parameters$$, conditional on the data and our knowledge of $$ is given by

If there is no prior information about noise, then $$ is known as a nuisance parameter and must also be eliminated by integrating it out. Using Jeffreys prior (Jeffreys, 1961) $$and integrating Equation (12) over $$ we obtain

This has the form of the “Student * t-* distribution”. As well as determining the values of the $$ parameters for which the posterior PDF for $$ is a maximum, it is also desirable to compute uncertainties associated with them. To do this, let us assume the case where $$ is known and let $$ represent the estimated values of$$. Following to Bretthorst’s work (Bretthorst, 1988), we can expand the function $$ in a Taylor series at the point$$, such that

with $$defined as $$ and $$for a single frequency case. For an arbitrary model the matrix $$ cannot be calculated analytically; however, it can be evaluated numerically. The calculations of the mean and standard deviations for $$ parameters require for evaluating Gaussian integrals by first changing to the orthogonal variables as was done above with the amplitudes. Let $$and $$ represent the $$ eigenvalue and eigenvector of the matrix$$, respectively. Then the new orthogonal variables are given by

By using these orthogonal variables to perform the $$ Gaussian integrals, the estimate variance for $$ can be calculated:

Therefore, the approximations for $$ can be implemented in the form:

In agree with Bretthorst, the expectation values of the amplitudes are given by$$. From Equation (15) the expected values for the old amplitudes $$becomes

The uncertainty in $$ is$$, so that the corresponding uncertainty in $$ is

## 3. Implementation of Simulated Annealing (SA) algorithm

Bayesian approach introduced by Bretthorst is briefly summarized in section 2 but, it is referred to Bretthorst's works (Bretthorst, 1988) for more detail information. Consequently, Bayesian parameter estimation turns into a global optimization problem which is a task to find the best possible solution $$ for Equations (19) or (20) satisfying

in the solution space$$.Because there is no variation at negative frequencies and the highest possible frequencies corresponds to wave that under goes a complete cycle in two unit intervals, so the lower limit on the range is 0 and all the variation is accounted for by frequencies less than$$.

Over last few decades, researchers have developed many computational algorithms to address such type of global optimization problems (Metropolis et al., 1953; Jeffreys, 1961; Kirkpatrick, et al., 1983; Laarhoven & Aarts, 1987; Stefankovic et al., 2009). Although there are numerous algorithms which are suggested to achieve this goal, few of them are capable of locating it effectively. Therefore, we follow the SA algorithm, suggested by Corana (Corana et. al., 1987) and modified by Goffe (Goffe et al., 1994), which is a kind of probabilistic algorithm for finding the global optimum of Equation (27) although its various alternative versions have already been used in statistical applications. A brief review of the most work on many algorithms based on that of SA, together with areas of applications is provided by Ingber and Binder (Binder, 1986; Ingber, 1994).

The algorithm begins with an initial guess of the frequencies$$, a trial step-length vector $$and a global parameter $$ (called the initial temperature). Each step of the SA algorithm replaces the current frequency with randomly generated new frequency. In other words, the next candidate point $$ is generated by varying one component $$of the current point $$ at a time:

where $$ is a uniformly distributed random number from the interval [-1,1] and $$ is a step vector. The function value of $$is then computed. If

then the point $$ is accepted as the $$th iteration point, it is replaced with $$and algorithm moves uphill. If $$ is the largest posterior PDF, denoted as$$its value and $$ are recorded since this is the best current value of the optimum. This forces the system toward a state corresponding to a local maximum or possibly a global maximum. However, most large optimization problems, like the one given in Equation (27), have many local maxima and optimization algorithm is therefore often trapped in a local maximum. To get out of a local maximum, a decrease of the function value $$ is accepted with a certain probability. This is accomplished by the Metropolis-criterion (Metropolis et al., 1953) which is based on the changes of obtaining new state with the posterior PDF of frequencies value, defined as

where $$represents difference between the present and previous posterior PDF values of frequencies, e.g.,$$. Whenever

where $$ is the number of accepted moves along the direction $$ and the parameter$$, which is initially defined by user, controls the step variation along the $$th direction. The aim of these variations in a step length is to maintain average percentage of accepted moves at about one-half of the total number of moves.

An alternative step size of above SA algorithm is given by replacing Equation (28) with

where $$ is a standard Normal distribution, $$represents the temperature at the $$th iteration and rescales the step size and ordering$$. On the other hand, $$is the CRLB for the $$th component of angular frequencies at the $$th iteration (Ireland, 2007) and provides a theoretical lower limit on how precisely parameter $$can be extracted from noisy measurements. In this respect, it is defined in the form:

where the Fisher information matrix $$(Kay, 1993), defined by

is an expectation of the second derivatives of the signal function with respect to$$. Assuming that the matrix $$ is diagonal for a large$$ so that its inversion is straightforward. In this case, the diagonal elements yield the lower bound (Stoica et al., 1989; Lagrange, 2005) for the variance of $$ asymptotically and we can write,

where $$ represents the estimated variance of the noise and is described in Equation (41). This whole cycle is then repeated $$ times, after which the temperature is decreased by a factor$$. This process is generally called annealing (or cooling) schedule which is the heart of the algorithm and effects the number of times the temperature is decreased. If a fast cooling takes place then the problem will be trapped in a local maximum. Therefore, there are various annealing schedules suggested by different researchers (Ingber, 1994; Stefankovic, et al., 2009) for lowering the temperature but we choose the following:

Because of being exponential rather than logarithmic, it is sometimes known as simulated quenching (SQ) (Vasan et al., 2009; Aguiare et al., 2012). In case of a well conditioned estimation problem like, say, frequency estimation problem in signal processing, it is clear that the convenience of SA algorithm, together with a need for some global search over local optima, makes a strong practical case for the use of SQ. Therefore, different parameters have different finite ranges and different annealing time- dependent sensitivities. Classical annealing algorithms have distributions that sample infinite ranges and there is no decision for considering differences in each parameter dimension; e.g. different sensitivities might be necessary to use different annealing schedules. This requires the development of a new PDF to embed in the desired features (Ingber, 1994) so that it leads to variants of SA that justifies exponential temperature annealing schedules.

Termination of the algorithm occurs when average function value of the sequences of the points after each $$ step cycle reaches a stable state:

where$$is a small positive number defined by user and $$ indicates the last four successive iteration values of the posterior PDF of the frequencies that are being stored. Further details of the algorithm initialization are problem-dependent and are given in Section 5.

## 4. Power spectral density

Before we discuss the computer simulated examples, there is something we need to say about how to display the results. The usual way the result from a spectral analysis is displayed is in the form of a power spectral density that shows the strength of the variation (energy) as a function of frequency. In Fourier transform spectroscopy this is typically taken as the squared magnitude of the discrete Fourier transform of the data. In order to display our results in the form of a power spectral density (Bretthorst, 1988; Gregory, 2005), it is necessary to give an attention to its definition that shows how much power is contained in a unit frequency. According to Bretthorst (Bretthorst, 1988) the Bayesian power spectral density is defined as the expected value of the power of the signals over the joint posterior PDF:

Performing integrals analytically over $$ by using these orthonormal model functions defined in section 2, the power spectral density can therefore be approximated as

This function stresses information about the total energy carried by the signal and about the accuracy of each line. In the next section, we will present some numerical examples how well this technique works.

## 5. Computer simulated examples

To verify the performance of the algorithm, we generated a simulated data vector according to one, two and with five sinusoids. Here $$ runs over the symmetric time interval $$ to $$ in $$ integer steps and the components of the noise$$ are generated from the zero mean Gaussian distribution with a known deviation$$, initially and added to the simulated data. However, one of the interests in an experiment is also to estimate noise power$$ so that it is assumed to be unknown. In agreement with Bretthorst, this is given in the following form:

Clearly, it is seen that the accuracy of the estimate depends on how long we sample and the signal-to-noise ratio (SNR), defined as the ratio of the root mean square of the signal amplitude to the noise$$. In addition, one may also get the following expression of SNR:

When the standard deviation of the noise is unknown, an empirical SNR is obtained by replacing $$ in Equation (42) with the estimated noise variance in (41).

In our first example, we generate the data set from the following equation:

We then carried out the Bayesian analysis of the simulated data, assuming that we know the mathematical form of the model but not the value of the parameters. We first gave starting values to the list of frequencies to begin a multidimensional search for finding a global maximum of the posterior PDF of the frequencies $$ given in Equations (19) or (20). As an initial estimate of the frequencies $$ for the maximization procedure, it is possible to take random choices from the interval$$. However, it is better to start with the locations of the peaks chosen automatically from the Fourier power spectral density graph by using a computer code written in Mathematica.

In agreement with Corana (Corana, et al., 1987), reasonable values of the parameters that control the SA algorithm are chosen as$$,$$ and$$. Then the global optimization algorithm starts at some high temperature$$. Thus the sequence of points is generated until a sort of equilibrium is approached; that is a sequence of points $$ whose average value of $$ reaches a stable value as iteration proceeds. During this phase the step vector $$ is periodically adjusted by the rule defined in Equation (32). The best point $$ reached so far is recorded. After thermal equilibration, the temperature $$ is reduced by using the annealing scheduled in Equation (37) with a factor $$and a new sequence is made starting from this best point$$, until thermal equilibrium is reached again and so on. The SA algorithm first builds up a rough view of the surface by moving large step lengths. As the temperature $$ falls and the step decreases, it is slowly focuses on the most promising area. Therefore it proceeds toward better maximum even in the presence of many local maxima by adjusting the step vector that can allow the algorithm to shape the space within it may move, to that within which ought to be as defined by the PDF$$. Consequently, the process is stop at a temperature low enough that no more useful improvement can be expected, according to a stopping criterion in Equation (38).

Once the frequencies are estimated, we then carried on calculating the amplitudes and parameter errors approximately using Equations (25), (23) and (26), respectively. However, an evaluation of the posterior PDF at a given point $$ cannot be made analytically. It requires a numerical calculation of projections onto orthonormal model functions, related to Eigen-decomposition of the $$ dimensional matrix$$. Therefore, the proposed algorithm was coded in * Mathematica* programming language (Wellin, P., et al., 2005), that provides a much flexible and efficient computer programming environment. Furthermore, it also contains a large collection of built-in functions so that it results much shorter computer codes than those written in C or FORTRAN programming languages.

The computer program was run on the workstation with four processors, which of each has got Intel Core 2 Quad Central Processing Unit (CPU), in two cases where the standard deviation of noise is known or not. The output of the computer simulations when $$ is illustrated in Table 1. The results when $$ is unknown are almost similar with that of Table 1. Parameter values are quoted as (* value*) ± (

*). It can be seen that a single frequency and amplitudes are recovered very well. The estimated value of SNR and the standard deviation of the noise are also shown in Table 1.*standard deviation

In our second example, we consider a signal model with two close harmonic frequencies:

In a similar way, we produced the same size data corrupted by the zero mean Gaussian noise with$$. We run our * Mathematica* code again in the case where the deviation of noise is unknown. The results, shown in Table 2, indicate that all values of the parameters within the calculated accuracy are clearly recovered.

In general, we consider a multiple harmonic frequency model signal:

The best estimates of parameters are tabulated in Table 3. Once again, all the frequencies have been well resolved, even the third and fourth frequencies which are too closed not to be separated by the Fourier power spectral density shown in Figure 3. Actually with the Fourier spectral density when the separation of two frequencies is less than the Nyquist step, defined as$$, the two frequencies are indistinguishable. This is simply because there are no sample points in between the two frequencies in the frequency domain. If $$ theoretically the two frequencies can then be distinguished. If $$is not large enough, the resolution will be very poor. Therefore, it is hard to tell where the two frequencies are located. This is just the inherent problem of the discrete Fourier power spectral density. In this example two frequencies are separated by 0.01, which is less than the Nyquist step size. There is no way by using Fourier power spectral density that one can resolve the closed frequencies less than the Nyquist step. However, Bayesian power spectral density shown in Figure 3 gives us very good results with high accuracy. Finally, we constructed the signal model in Equation (3), whose parameters, amplitudes and frequencies, are randomly chosen from uniform distribution in the intervals $$ and$$, respectively and used it to generate data samples of $$ by adding a zero mean Gaussian random noise with$$. The proposal algorithm was rerun for recovering sinusoids from it and the results are tabulated in Table (4). It can be seen that frequencies are specially recovered with high accuracies. Ten frequencies signal model are shown in Fig.5.

In order to compare the results with Bretthorst's in this example we converted

On the other hand, modifications of this algorithm (Üstündag & Cevri, 2008; 2011) have already been made by generating the next candidate in Equation (33) from normal distribution with a mean of the current estimation whose standard deviation is a square root of the CRLB (Ireland, 2007) given in Equation (36), which is a lower limit to the variance of the measurement of the frequencies, so this generates a natural scale size of the search space around their estimated values. It is expected that better solutions lie close the ones that are already good and so normally distributed step size is used. Consequently, the results we obtained are comparable with or higher than those obtained previously. In addition, all the results discussed so far are also consistent with those of Bretthorst (Bretthorst, 1988) and also demonstrate the advantages of the Bayesian approach together with SA algorithm. Moreover it appears to be very reliable, in the sense that it always converged to neighborhood of the global maximum. The size of this neighborhood can be reduced by altering the control parameters of the SA algorithm, but this can be expensive in terms of CPU consumption. Moreover, we initially assumed that the values of the random noise in data were drawn from the Gaussian density with the mean$$ and the standard deviation $$ Figure 4 shows the exact and estimate probability densities of the random noise in data. It is seen that the estimated (dotted) probability density is closer to the true (solid) probability density and the histogram of the data is also much closer to the true probability density. The histogram is known as a nonparametric estimator of the probability density because it does not depend on specified parameters.

Computer simulations had been carried out to compare the performance of the method with the CRLB. To do this, we generated 64 data samples from a single real tone frequency signal model$$ and added it to the variety of noise levels. After 50 independent trials under different SNR ratios, the mean square errors (MSE) for the estimated frequencies were obtained and their logarithmic values were plotted with respect to SNR ratios that vary between zero and $$dB (deci Bell). It can be seen from Figure 6 that the proposed estimator has threshold about 3 dB of the SNR and follows nicely the CRLB after this value. As expected, larger SNR gives smaller MSE. However, many of existing methods in signal processing literature have a MSE that is close to the CRLB when the SNR is more than 20 dB and they usually perform poorly when the SNR is decreased.

The computational complexity of the algorithm is dependent upon a few parameters such as annealing schedule, data samples and parameters. Under using same annealing schedule, Fig. 7 shows only CPU time of different simulations in a variety of number of data samples. It can be clearly seen that an increase in these numbers causes larger consumption of CPU time. With fixed size of components set and specifically annealing schedule of SA algorithm, the overall execution time of the cooling and decision is almost constant, but the runtime of the first two stages (move and evaluate) mostly depends on complicated design constraints and objective functions. Because the move and the evaluation process in the SA algorithm play an important role in CPU resource usage, improving the calculation ability for these stages will be the most feasible approach for an optimizing SA so that parallel computing is one of best approaches for this goal.

## 6. Conclusion

In this work we have partially developed a Bayesian approach combined with SA algorithm and applied it to spectral analysis and parameter algorithm for estimating parameters of sinusoids from noisy data. Overall, results presented here show that it provides rational approach for estimating the parameters, namely the frequencies and the amplitudes, can be recovered from the experimental data and the prior information with high accuracy, especially the frequency, which is the most important parameter in spectral analysis. A significant advantage of this approach comes from the very large posterior probabilities, which are sharply peaked in the neighborhood of the best fit. This helps us to simplify the problem of choosing starting values for the iteration and it provides a rational approach for estimating, in an optimal way, values of parameters by performing a random search. On the other hand, for sufficiently high SNR, MSEs of frequencies will attain CRLB so that it justifies the accuracy of the frequency estimation. Although the SA algorithm spends large consumption of CPU time, it is competitive when compared to the multiple runs often used with conventional algorithms to test different starting values. As expected, parallel implementation of SA algorithm reduces CPU resource usage.

Data analysis given in this chapter has also been applied to more complicated models and conditions, such as signals decay, periodic but non-harmonic signals, signals with non-stationary, etc.,. In general, we have also not addressed the problem of model selection which is the part of spectral analysis. In this case, one has enough prior information in a given experiment to select the best model among a finite set of model functions so that Bayesian inference helps us to accomplish it. Therefore, it will deserve further investigations.

### Nomenclature

$$: Nuisance parameters

$$: Amplitudes of $$sinusoidal

$$: Marginal probability density function

$$: Conditional probability density function

$$: A matrix defined as $$

$$: A parameter which controls the step variation along the $$th direction

$$: A set of observed data

$$: Measurement errors

$$: A small positive number defined by user

$$: Expectation of a function or random variable

$$: Model functions that contains sinus and cosines terms

$$: Projections of data onto orthonormal model functions

$$: Orthonormal model functions

$$: Mean of square of projections

$$: Prior information

$$: Fisher information matrix

$$: Inverse of Fisher information matrix

$$: Number of frequencies.

$$: Number of data samples

$$A standard Normal distribution

$$: The number of accepted moves along the direction$$.

$$: Number of iterations

$$: Least square function

$$: Bayesian power spectral density

$$: Sampling variance

$$: Discrete time set

$$: Standard deviation of angular frequency

$$: Standard deviation of amplitude

$$: Solution space of angular frequencies

$$: $$Eigenvalue of the matrix$$

$$: $$Eigenvector of the matrix$$

$$: A factor between 0 and 1.

$$: Parameters vector

$$:$$ th component of normalized Eigenvalues of$$

$$: $$th component of $$th normalized Eigenvector of$$

$$: Step-length vector

$$: Trial step-length vector

$$: Uniformly distributed random number from the interval $$

$$: Variance of noise

$$: Expected value of noise variance

$$: CRLB for $$th component of $$

$$: Vector of angular frequency

$$: Estimated angular frequency vector

$$: An initial guess vector for frequencies

$$: The initial temperature

$$: Diagonal matrix

$$: Absolute value

CRLB: Cramér-Rao lower bound

PDF: Probability Density Function

SA : Simulated Annealing

SNR : Signal to Noise Ratio

SQ : Simulated Quenching

MSE: Mean Squared Errors

dB : deci Bell

### Acknowledgement

This work is a part of the projects, whose names are "Bayesian Model Selection and Parameter Estimation" with a number FEN - DKR - 200407 - 0082 and "Bayesian Spectrum Estimation of Harmonic Signals" with a number FEN-BGS-060907-0191, supported by the Marmara University, Istanbul, Turkey.

## References

- 1.
Aguiare H. Junior O. Ingber L. Petraglia A. Petraglia M. R. Machado M. A. S. 2012 Stochastic Global Optimization and Its Applications with Fuzzy Adaptive Simulated Annealing. Springer Heidelberg New York Dorddrecht London,36 57 - 2.
Barbedo J. G. A. Lopes A. 2009 Estimating Frequency, Amplitude and Phase of two Sinusoids with Very Close Frequencies, International Journal of Signal processing,5 2 138 145 - 3.
Bernardo J. M. Smith A. F. M. 2000 Bayesian Theory, Wiley, New York. - 4.
Binder K. 1986 Monte Carlo Methods in Statistical Physics, 2^{nd}Edition, Springer-Verlag. - 5.
Bretthorst G. L. 1988 Bayesian Spectrum Analysis and Parameter Estimation, Lecture Notes in Statistics, Springer-Verlag Berlin Heidelberg, New York. - 6.
Bretthorst G. L. 1990 An Introduction to Parameter Estimation Using Bayesian Probability Theory, Maximum Entropy and Bayesian Methods, Kluwer Academic Publishers,53 79 - 7.
Brooks S.P. & Morgan B.J.T. 1995 Optimization using simulated annealing, Journal of Royal Society (The Statistician) 44,241 257 - 8.
Chan K. W. So H. C. 2004 Accurate frequency estimation for real harmonic sinusoids, IEEE Signal Processing Letters 11,609 612 - 9.
Cooley J. W. Tukey J. W. 1964 An algorithm for the machine calculation of complex Fourier series, Mathematics of Computation 19,297 301 - 10.
Corana A. Marchesi M. Martini C. Ridella S. 1987 Minimizing multimodal functions of continuous variables with the simulated annealing algorithm, ACM Transactions on Mathematical Software 13,262 280 - 11.
Edwards A. W. F. 1972 Likelihood, Cambridge University Press. - 12.
Gregory P. C. 2005 Bayesian Logical Data Analysis for the Physical Science, Cambridge University Press. - 13.
Goffe W. L. Ferier G. D. Rogers J. 1994 Global optimization of statistical functions with simulated annealing, Journal of Econometrics 60,65 100 - 14.
Harney H. L. 2003 Bayesian Inference: Parameter Estimation and Decisions, Springer-Verlag Berlin Heidelberg. - 15.
Hooke T. R. Jevees T. A. 1962 Direct search solution of numerical and statistical problems, Journal of Association of Computer Machinery 5,212 229 - 16.
Ingber L. 1994 Simulated Annealing: Practice versus theory, Mathematical computer Modeling 18,29 57 - 17.
Ingber L. 1996 Adaptive Simulated Annealing ASA: Lessons learned, Control and Cybernetic 25,33 54 - 18.
Ireland J. 2007 Simulated annealing and Bayesian posterior distribution analysis applied to spectral emission line fitting, Journal of Solar Physics 243,237 252 - 19.
Jaynes E. T. 1987 Bayesian Spectrum and Chirp Analysis, in Maximum Entropy and Bayesian Spectral Analysis and Estimation Problems, C. R. Smith and G. J. Erickson (eds.), D. Reidel, Dordrecht,1 37 - 20.
Jaynes E. T. 2003 Probability Theory: The Logic of Science, Cambridge University Press. - 21.
Jeffreys H. 1961 Theory of Probability, Oxford University Press, London. - 22.
Kay S. M. 1993 Fundamentals of statistical signal processing: Estimation theory, Prentice Hall,1 - 23.
Kirkpatrick S. Gelatt C. D. Vecchi M. P. 1983 Optimization by Simulated Annealing, Journal of Science 220,671 680 - 24.
Laarhoven P. V. Aarts E. 1987 Simulated annealing: Theory and Applications, D. Reidel Publishing Company. - 25.
Locatelli M. 2000 Simulated annealing algorithms for Continuous global optimization: Convergence Conditions, Journal of Optimization Theory and Applications 106,121 133 - 26.
Lagrange M. Marchand S. Rault-Rr J. 2005 Improving sinusoidal frequency estimation using a trigonometric approach, Proceedings of the Digital Audio Effects Conference, Spain,110 115 - 27.
Metropolis N. Rosenbluth A. Rosenbluth M. Teller A. M. Teller E. . 1953 Equation of states calculations by fast computing machines, Journal of Chemical Physics 21,1087 1092 - 28.
Mc Whorter T. Scharf L. L. 1993 Cramer-Rao Bound for deterministic modal analysis, IEEE Transactions on Signal Processing 41,1847 1866 - 29.
Nishiyama K. 1997 A nonlinear filter for estimating a sinusoidal signal and its parameters in white noise: On the case of a single sinusoid, IEEE transactions on signal processing 45,970 981 - 30.
Norton J. P. 1986 An introduction to identification, Academic Press, 1986. - 31.
Press W. H. Flannery B. P. Teukolshy S. A. Vetterling W. T. 1995 Numerical recipes in C: The art of computing, Cambridge University Press. - 32.
Ruanaidh J. J. K. Q. Fitzgerald W. J. 1996 Numerical Bayesian Methods Applied to Signal Processing, Springer-Verlag, New York. - 33.
Stoica P. Nehorai A. 1989 MUSIC, Maximum Likelihood, and Cramer-Rao Bound, IEEE Transactions on Acoustic Speech, and Signal Processing 37,720 741 - 34.
Schuster A. 1905 The Periodogram and its optical analogy, Proceedings of the Royal Society of London, Series A 77,136 140 - 35.
Stefankovic D. Vempala S. Vigoda E. 2009 Adaptive simulated annealing: A near- optimal connection between sampling and counting. Journal of the ACM 56,doi:10.1145/1516512.1516520 - 36.
Tsoulos G. Lagaris I. E. 2006 GenAnneal: Genetically modified Simulated Annealing, Computer Physics Communications 174,846 885 - 37.
Üstündag D. Queen Bowcock J. E. 1989 A method for retrieving images from noisy, incomplete data, Proceeding of the Fifth Alvey Vision Conference, University of Reading,239 245 - 38.
Üstündag D. Queen N. M. Skinner G. K. Bowcock J. E. 1991 Two new methods for retrieving an image from noisy, incomplete data and comparison with the Cambridge MaxEnt package, 10. International Workshop on Maximum Entropy and Bayesian Methods, MaxEnt 90,295 301 - 39.
Üstündağ D. Cevri M. 2008 Estimating parameters of sinusoids from noisy data using Bayesian inference with simulated annealing, WSEAS Transactions on Signal Processing 7,432 441 - 40.
Üstündağ D. Cevri M. 2011 Recovering sinusoids from noisy data using Bayesian inference with simulated annealing, Mathematical and Computational Applications 16,382 391 - 41.
Üstündağ D. 2011 Recovering sinusoids from data using Bayesian inference with RJMCMC, Proceeding of Seventh International Computation on Natural Computation (ICNN), Shanghai,1850 1854 - 42.
Vasan A. Raju K. S. 2009 Comparative analysis of Simulated annealing, Simulated quenching and genetic algorithms for optimal reservoir operation, Applied Soft Computing 9,274 281 - 43.
Wellin P. Gayllord R. Kamin S. 2005 An Introduction to programming with Mathematica, Cambridge University Press.

## Notes

- Static refers to that the amplitudes of the sinusoids do not change with time.
- In order to compare the results with Bretthorst's in this example we converted