Open access peer-reviewed chapter

# Methods of Estimating Uncertainty of Climate Prediction and Climate Change Projection

By Youmin Tang, Dake Chen, Dejian Yang and Tao Lian

Submitted: December 13th 2011 | Reviewed: November 2nd 2012 | Published: January 16th 2013

DOI: 10.5772/54810

## 1. Introduction

A critical issue in climate prediction and climate change projection is the estimation of their uncertainty. Estimating this uncertainty, also known as the study of potential predictability, has been an intensive research field in recent years. The terms "uncertainty of prediction" and "potential predictability" are often used interchangeably in the literature because of their inherent linkage, although they differ somewhat in a rigorous framework of predictability theory. For example, when a system has high potential predictability, the uncertainty of its predictions can be expected to be small, and vice versa. In this chapter, unless otherwise indicated, the uncertainty of prediction and potential predictability have a similar meaning in describing and measuring prediction utility, and are thus used interchangeably. For simplicity, we also often use the term predictability to denote potential predictability.

The study of prediction uncertainty or predictability is usually conducted using ensemble prediction, from which several metrics can be derived to quantify potential predictability. Among them are variance-based measures and information-based measures, which quantify predictability or prediction uncertainty from different perspectives. In this chapter, we introduce both kinds of metrics. Emphasis is placed on the similarities and differences between these measures, and on their practical application to the uncertainty of climate prediction and climate change projection. It should be noted that these potential predictability metrics do not make use of observations, which makes them essentially different from actual prediction skill measured against observations, such as correlation skill or root mean square error (RMSE).

## 2. Two methods of measuring potential predictability

### 2.1. Signal-to-Noise Ratio (SNR) and potential predictability

The SNR has been a widely used measure of potential predictability [1, 2]. At seasonal time scale, the signal is usually regarded as the atmospheric responses to the slowly varying external forcing such as sea surface temperature (SST), sea ice, snow cover, etc., whereas the noise is induced by the relatively high frequency atmospheric variability such as weather processes. In an ensemble seasonal climate prediction, the amplitude of signal and noise can be approximately quantified by the variance of ensemble mean and the averaged ensemble spread over all initial conditions [3-5], namely,

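$$
\mathrm{Var}(S)=\frac{1}{M}\sum_{i=1}^{M}\left(\bar{X}_i-\bar{\bar{X}}\right)^2,\qquad
\mathrm{Var}(N)=\frac{1}{MK}\sum_{i=1}^{M}\sum_{j=1}^{K}\left(X_{i,j}-\bar{X}_i\right)^2 \tag{1}
$$

where $X_{i,j}$ is the $j$-th member of the ensemble prediction starting from the $i$-th initial condition. $X$ itself can be a scalar, such as an index, or a two-dimensional field. $K$ is the ensemble size and $M$ is the total number of initial conditions (predictions); $\bar{X}_i=\frac{1}{K}\sum_{j=1}^{K}X_{i,j}$ and $\bar{\bar{X}}=\frac{1}{M}\sum_{i=1}^{M}\bar{X}_i$.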

Considering the sampling errors in estimating the signal variance, a more accurate estimate of $\mathrm{Var}(S)$ is

$$
\mathrm{Var}(S)=\frac{1}{M}\sum_{i=1}^{M}\left(\bar{X}_i-\bar{\bar{X}}\right)^2-\frac{1}{K}\mathrm{Var}(N) \tag{2}
$$

Two common measures of potential predictability are the signal-to-noise ratio (SNR) and the signal-to-total variance ratio (STR), i.e.,

$$
\mathrm{SNR}=\frac{\mathrm{Var}(S)}{\mathrm{Var}(N)},\qquad \mathrm{STR}=\frac{\mathrm{Var}(S)}{\mathrm{Var}(S)+\mathrm{Var}(N)} \tag{3}
$$

It can be shown that the square root of STR is equivalent to the correlation between the signal component ($S$) and the prediction target itself. Thus, $\sqrt{\mathrm{STR}}$ is often referred to as the potential correlation (PCORR).

$\sqrt{\mathrm{STR}}$ is in fact a perfect-model correlation skill, which assumes that the observation is an arbitrary ensemble member and therefore ignores the imperfection of the model itself. To see this, take the ensemble mean $\mu$ as the prediction, so that the 'observation' can be written as $\mu+\varepsilon$, where $\varepsilon$ is normally distributed white noise with zero mean and variance $\sigma_e^2$, assumed independent of $\mu$.

The correlation of the prediction against the 'observation' can then be written as follows:

$$
Corr_{pef}=\frac{E[\mu(\mu+\varepsilon)]}{\left\{E(\mu^2)\,E[(\mu+\varepsilon)^2]\right\}^{1/2}}
=\frac{E(\mu^2)}{\sqrt{E(\mu^2)\,E[\mu^2+2\mu\varepsilon+\varepsilon^2]}}
=\sqrt{\frac{E(\mu^2)}{E(\mu^2)+E(\varepsilon^2)}} \tag{4}
$$

Comparison between (3) and (4), with $E(\mu^2)$ playing the role of $\mathrm{Var}(S)$ and $E(\varepsilon^2)$ that of $\mathrm{Var}(N)$, shows that $Corr_{pef}$ equals $\sqrt{\mathrm{STR}}$.
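As a rough illustration of the estimators above, the following sketch computes Var(S), Var(N), SNR and STR from an ensemble array and compares $\sqrt{\mathrm{STR}}$ with a perfect-model correlation. The function name `snr_str`, the array layout `(M, K)` and the synthetic data (signal variance 1, noise variance 4) are assumptions made for this example only, not part of the original study.

```python
import numpy as np

def snr_str(x):
    """Return Var(S), Var(N), SNR and STR for an ensemble array x of shape (M, K)."""
    M, K = x.shape
    ens_mean = x.mean(axis=1)                  # ensemble mean for each initial condition
    grand_mean = ens_mean.mean()               # mean over all initial conditions
    var_n = np.sum((x - ens_mean[:, None]) ** 2) / (M * K)        # noise variance
    var_s = np.mean((ens_mean - grand_mean) ** 2) - var_n / K     # bias-corrected signal variance
    return var_s, var_n, var_s / var_n, var_s / (var_s + var_n)

# Synthetic ensemble (assumed): unit-variance signal plus noise of variance 4
rng = np.random.default_rng(0)
M, K = 2000, 20
signal = rng.normal(0.0, 1.0, size=M)
x = signal[:, None] + rng.normal(0.0, 2.0, size=(M, K))

var_s, var_n, snr, str_ = snr_str(x)

# Perfect-model check: correlate the signal with one member used as the 'observation'
corr = np.corrcoef(signal, x[:, 0])[0, 1]
print(f"SNR={snr:.3f} STR={str_:.3f} sqrt(STR)={np.sqrt(str_):.3f} corr={corr:.3f}")
```

With a finite ensemble the two numbers agree only approximately, since both the variance estimates and the correlation carry sampling error.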

### 2.2. Information-based potential predictability

#### 2.2.1. Relative Entropy and predictive information

Entropy is a measure of the level of dispersion (i.e., uncertainty). The entropy of a continuous distribution $p(x)$ is defined as

$$
H(x)=-\int p(x)\ln p(x)\,dx, \tag{5}
$$

where the integral is understood to be a multiple integral over the domain of x. Larger entropy is associated with smaller probability and larger uncertainty.

The information-based potential predictability measures include relative entropy (RE), predictive information (PI) and predictive power (PP). The central idea of these measures is that the difference between two probability distributions, the forecast distribution and the climatological distribution, quantifies the extra information provided by the prediction.

Suppose that the future state of a climate variable is predicted/modeled as a random variable $\nu$ with a climatological distribution $p(\nu)$. One ensemble prediction produces a forecast distribution, which is the conditional distribution $p(\nu|i)$ given the initial condition $i$. The climatological distribution is also the unconditional distribution, and we have

$$
p(\nu)=\int p(\nu|i)\,p(i)\,di, \tag{6}
$$

where $p(i)$ is the probability distribution of the initial condition $i$. Usually various statistical tests are used to examine the difference between two distributions [6-7]. Relative entropy RE, or Kullback-Leibler distance, is a quantitative measure of the difference between two distributions from information theory [8]. In the context of predictability, it is defined as

$$
RE=\int p(\nu|i)\ln\frac{p(\nu|i)}{p(\nu)}\,d\nu. \tag{7}
$$

In terms of information theory, RE measures the informational inefficiency of using the climatological distribution $p(\nu)$ rather than the forecast distribution $p(\nu|i)$, and $RE\ge 0$, with equality if and only if $p(\nu|i)=p(\nu)$ [8]. In Bayesian terminology, the climatological distribution is a prior distribution, which can usually be derived from long-term historical observations. An ensemble prediction augments this prior information, and the additional information measured by RE is a natural measure of the utility or usefulness of the prediction and thus implies potential predictability. In practice, $p(\nu|i)$ and $p(\nu)$ can be estimated directly from samples or, alternatively, approximated using kernel density estimation.

Another natural measure of predictability is the predictive information (PI), defined as the difference between the entropies of the climatological and forecast distributions:

$$
PI=H(\nu)-H(\nu|i).
$$

Using the definition of entropy, this becomes

$$
PI=-\int p(\nu)\ln[p(\nu)]\,d\nu+\int p(\nu|i)\ln[p(\nu|i)]\,d\nu. \tag{8}
$$

The first term on the right-hand side of Eq. (8) is the entropy of the prior distribution $p(\nu)$ (the climatological distribution), measuring the uncertainty at a prior time when no extra information is provided by the observed initial condition and forecast model. The second term represents the entropy of the posterior distribution $p(\nu|i)$ (the forecast distribution), measuring the uncertainty after the observed initial condition and subsequent prediction become available (an elaborated illustration can be found in [9]). Thus a large PI indicates that the posterior uncertainty decreases because useful information is provided by the prediction (e.g., the larger $p(\nu|i)$, the smaller the uncertainty); that is, the prediction is more reliable in a "perfect model" context.

The predictive power (PP) was defined by [10] as

$$
PP=1-\exp(-PI). \tag{9}
$$

When the PDFs are Gaussian, which is a good approximation in many practical cases (including ENSO prediction, e.g., [11]), these measures can be computed from the predictive and climatological variances and the difference between their means. The resulting analytical expressions for RE, PI and PP are as follows [1]:

$$
RE=\frac{1}{2}\left\{\ln\left[\frac{\det(\Sigma_q^2)}{\det(\Sigma_p^2)}\right]+\mathrm{trace}\left[\Sigma_p^2(\Sigma_q^2)^{-1}\right]+(\mu_p-\mu_q)^T(\Sigma_q^2)^{-1}(\mu_p-\mu_q)-n\right\} \tag{10}
$$

$$
PI=\frac{1}{2}\ln\left(\frac{\det(\Sigma_q^2)}{\det(\Sigma_p^2)}\right) \tag{11}
$$

$$
PP=1-\left(\frac{\det(\Sigma_q^2)}{\det(\Sigma_p^2)}\right)^{-1/2} \tag{12}
$$

where $\Sigma_q^2$ and $\Sigma_p^2$ are the climatological and predictive covariance matrices, respectively; $\det$ is the determinant operator and $\mathrm{trace}$ is the trace operator; $\mu_q$ and $\mu_p$ are the climatological and predictive mean state vectors of the system; and $n$ is the number of degrees of freedom. RE is composed of two components: (i) a reduction in climatological uncertainty by the prediction [the first two terms plus the last term on the right-hand side of (10)] and (ii) a difference between the predictive and climatological means [the third term on the right-hand side of (10)]. These components can be interpreted, respectively, as the dispersion and signal components of the utility of a prediction [12]. A large value of RE indicates that more information that differs from the climatological distribution is being supplied by the prediction, which could be interpreted as making it more reliable [1]. A key difference between relative entropy (RE) and predictive information (PI) is that RE vanishes if and only if the forecast and climatological distributions are identical (i.e., same mean and spread), while PI is zero as long as the two distributions have the same spread [9]. Remarkably, predictive information and relative entropy are invariant with respect to linear invertible transformations of the state [9-10].
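The Gaussian expressions for RE and PI, and their invariance under an invertible linear transformation, can be checked numerically. The sketch below is only illustrative: the function names `gaussian_re` and `gaussian_pi`, the randomly generated covariance matrices and the transformation matrix `A` are assumptions introduced for this example.

```python
import numpy as np

def gaussian_re(mu_p, cov_p, mu_q, cov_q):
    """Relative entropy of a Gaussian forecast (p) with respect to a Gaussian climatology (q)."""
    n = mu_p.size
    logdet_p = np.linalg.slogdet(cov_p)[1]
    logdet_q = np.linalg.slogdet(cov_q)[1]
    d = mu_p - mu_q
    # Dispersion part: log-determinant ratio + trace term - n
    dispersion = logdet_q - logdet_p + np.trace(np.linalg.solve(cov_q, cov_p)) - n
    # Signal part: squared mean shift measured in the climatological metric
    signal = d @ np.linalg.solve(cov_q, d)
    return 0.5 * (dispersion + signal)

def gaussian_pi(cov_p, cov_q):
    """Predictive information for Gaussian forecast and climatological distributions."""
    return 0.5 * (np.linalg.slogdet(cov_q)[1] - np.linalg.slogdet(cov_p)[1])

# Hypothetical 3-variable example with random symmetric positive-definite covariances
rng = np.random.default_rng(0)
n = 3
B_q, B_p = rng.normal(size=(n, n)), rng.normal(size=(n, n))
cov_q = B_q @ B_q.T + n * np.eye(n)
cov_p = 0.3 * (B_p @ B_p.T + np.eye(n))
mu_q, mu_p = np.zeros(n), rng.normal(size=n)
A = rng.normal(size=(n, n)) + 3.0 * np.eye(n)   # an invertible linear transformation

re = gaussian_re(mu_p, cov_p, mu_q, cov_q)
re_t = gaussian_re(A @ mu_p, A @ cov_p @ A.T, A @ mu_q, A @ cov_q @ A.T)
pi = gaussian_pi(cov_p, cov_q)
pi_t = gaussian_pi(A @ cov_p @ A.T, A @ cov_q @ A.T)
print(re, re_t, pi, pi_t)
```

Using `slogdet` and `solve` rather than explicit determinants and inverses is a numerical-stability choice, not something prescribed by the formulas.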

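For a scalar variable (e.g., an index), with $\mu_p$ measured as an anomaly from the climatological mean (i.e., $\mu_q=0$), RE, PI, and PP simplify to

$$
PI=\frac{1}{2}\ln\left(\frac{\sigma_q^2}{\sigma_p^2}\right) \tag{13}
$$

$$
RE=\frac{1}{2}\Bigg[\underbrace{\ln\left(\frac{\sigma_q^2}{\sigma_p^2}\right)+\frac{\sigma_p^2}{\sigma_q^2}-1}_{\text{Dispersion}}+\underbrace{\frac{\mu_p^2}{\sigma_q^2}}_{\text{Signal}}\Bigg]
=PI+\frac{1}{2}\left[\frac{\sigma_p^2}{\sigma_q^2}-1+\frac{\mu_p^2}{\sigma_q^2}\right] \tag{14}
$$

$$
PP=1-\left(\frac{\sigma_p^2}{\sigma_q^2}\right)^{1/2} \tag{15}
$$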
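For a single index, these scalar formulas are straightforward to evaluate. The following sketch assumes a zero climatological mean, as in the formulas above; the function name `gaussian_scalar_measures` and the numerical inputs are hypothetical.

```python
import numpy as np

def gaussian_scalar_measures(mu_p, var_p, var_q):
    """Return (RE, PI, PP) for a scalar Gaussian forecast, assuming zero climatological mean."""
    pi = 0.5 * np.log(var_q / var_p)
    dispersion = 0.5 * (np.log(var_q / var_p) + var_p / var_q - 1.0)
    signal = 0.5 * mu_p ** 2 / var_q
    pp = 1.0 - np.sqrt(var_p / var_q)
    return dispersion + signal, pi, pp

# Hypothetical forecast: mean anomaly 1.0, variance 0.5, against unit climatological variance
re, pi, pp = gaussian_scalar_measures(mu_p=1.0, var_p=0.5, var_q=1.0)
print(re, pi, pp)

# A forecast identical to climatology carries no information
re0, pi0, pp0 = gaussian_scalar_measures(mu_p=0.0, var_p=1.0, var_q=1.0)
```

Note that a forecast with the climatological spread but a shifted mean has PI = 0 and RE > 0, which is the difference between the two measures described in the text.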

#### 2.2.2. Mutual information

RE or PI is a predictability measure for individual predictions. The average of REs or PIs over all initial conditions reflects the average predictability and was proved to be equal to mutual information (MI), another quantity from information theory [9]. In the context of predictability, MI is defined as [9]

$$
MI=\iint p(\nu,i)\ln\left[\frac{p(\nu,i)}{p(\nu)\,p(i)}\right]d\nu\,di \tag{16}
$$

where $p(\nu,i)$ is the joint probability distribution of $\nu$ and $i$. MI measures the statistical dependence between $\nu$ and $i$, and vanishes when $\nu$ and $i$ are independent ($p(\nu,i)=p(\nu)p(i)$). The equality of MI and average RE indicates that predictability can be measured in two equivalent ways: by the difference between the forecast and climatological distributions, or by the degree of statistical dependence between the initial condition $i$ and the future state $\nu$ [13]. If the future state $\nu$ is on average unpredictable, individual forecasts should have a probability distribution identical to the climatological distribution, i.e., $p(\nu|i)=p(\nu)$ and RE = 0 for all predictions. This is equivalent to independence between $i$ and $\nu$. Therefore, independence indicates unpredictability and dependence implies predictability. MI is invariant with respect to nonlinear, invertible (nonsingular) transformations of the state [9]. Thus, the MI between $\nu$ and $i$ equals the MI between $\nu$ and the ensemble mean $\mu_{\nu|i}$. The latter is probably more straightforward for understanding MI-based predictability, since the dependence between $\nu$ and the ensemble mean $\mu_{\nu|i}$ can be interpreted as the dependence between observation ($\nu$) and prediction ($\mu_{\nu|i}$) under the assumption of a perfect model.

When the forecast and climatological distributions are Gaussian, MI can be expressed, using (13), as [13]

$$
MI=\frac{1}{2}\left(\ln\sigma_\nu^2-\overline{\ln\sigma_{\nu|i}^2}\right), \tag{17}
$$

where the overbar denotes the average over all initial conditions.

Eq. (17) is the formula often used to calculate MI. Joe [14] and DelSole [15] showed that the transformations $\sqrt{1-\exp(-2MI)}$ and $1-\exp(-2MI)$ produce "potential" skill scores that exhibit proper limiting behavior: they take values between 0 and 1, and the minimum (maximum) value 0 (1) occurs when MI vanishes (approaches infinity). Here "potential" indicates that they are perfect-model measures. In this study, we use these two "potential" skills to represent MI. Furthermore, if the forecast and climatological distributions are Gaussian and the forecast variance is constant, the two "potential" skills reduce, respectively, to two conventional "potential" skills: the "potential" anomaly correlation ($AC_p$) and the "potential" mean square skill score ($MSSS_p$) [9,13]:

$$
AC_p=\sqrt{1-\exp(-2MI)}, \tag{18}
$$

$$
MSSS_p=AC_p^2=1-\exp(-2MI). \tag{19}
$$
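A minimal sketch of how the Gaussian MI and the two "potential" skill transformations might be evaluated is given below. The function names `gaussian_mi` and `potential_skills`, and the example variances, are assumptions for illustration.

```python
import numpy as np

def gaussian_mi(var_clim, var_fcst):
    """Gaussian MI: half the climatological log-variance minus the mean forecast log-variance."""
    return 0.5 * (np.log(var_clim) - np.mean(np.log(var_fcst)))

def potential_skills(mi):
    """Map MI to the 'potential' anomaly correlation and mean square skill score."""
    msss = 1.0 - np.exp(-2.0 * mi)
    return np.sqrt(msss), msss

# Hypothetical case: unit climatological variance, constant forecast variance 0.36
mi_const = gaussian_mi(1.0, np.full(10, 0.36))
ac_p, msss_p = potential_skills(mi_const)
print(mi_const, ac_p, msss_p)
```

In this constant-variance case the MI-based score reduces to one minus the ratio of forecast to climatological variance, consistent with the limiting behavior described above.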

### 2.3. Relationship between SNR-based metrics and MI-based metrics

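The averages of RE and PI ($\overline{RE}$ and $\overline{PI}$) over all predictions (initial conditions) are identical to MI, as mentioned before. For seasonal climate prediction, the total variance (i.e., the climatological variance) can be decomposed into the signal ($S$) variance and the noise ($N$) variance, if the signal and noise are assumed to be independent of each other [16-17], namely,

$$
\mathrm{Var}(T)=\mathrm{Var}(S)+\mathrm{Var}(N) \tag{20}
$$

where

$$
\mathrm{Var}(T)=\frac{1}{MK}\sum_{i=1}^{M}\sum_{j=1}^{K}\left(X_{i,j}-\bar{\bar{X}}\right)^2,
$$

$$
\mathrm{Var}(S)=\frac{1}{M}\sum_{i=1}^{M}\left(\bar{X}_i-\bar{\bar{X}}\right)^2,
$$

$$
\mathrm{Var}(N)=\frac{1}{MK}\sum_{i=1}^{M}\sum_{j=1}^{K}\left(X_{i,j}-\bar{X}_i\right)^2.
$$

$X_{i,j}$ is the $j$-th member of the ensemble prediction starting from the $i$-th initial condition. $K$ is the ensemble size and $M$ is the total number of initial conditions (predictions); $\bar{X}_i=\frac{1}{K}\sum_{j=1}^{K}X_{i,j}$ and $\bar{\bar{X}}=\frac{1}{M}\sum_{i=1}^{M}\bar{X}_i$.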

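Without loss of generality, the climatological mean is assumed to be zero; thus (20) can be expressed as

$$
\overline{\sigma_q^2}=\overline{\mu_p^2}+\overline{\sigma_p^2} \tag{21}
$$

where the overbar denotes the expectation over all predictions (initial conditions). Eqs. (14) and (21) readily confirm this property of MI, for example,

$$
MI=\overline{RE}=\overline{PI}+\frac{1}{2}\left[\frac{\overline{\mu_p^2}+\overline{\sigma_p^2}-\overline{\sigma_q^2}}{\sigma_q^2}\right]=\overline{PI} \tag{22}
$$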

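Using (21), the information-based potential predictability measures MI ($\overline{RE}$ or $\overline{PI}$) and $\overline{PP}$ can be rewritten as functions of the mean signal and noise, or of their ratio SNR. The $\sigma_q$ and $\sigma_p$ in (21) are actually $\sigma_\nu$ and $\sigma_{\nu|i}$ in (17); thus we have [18]

$$
MI=\frac{1}{2}\left(\ln\sigma_\nu^2-\overline{\ln\sigma_{\nu|i}^2}\right)\ge\frac{1}{2}\left(\ln\sigma_\nu^2-\ln\overline{\sigma_{\nu|i}^2}\right)
=-\frac{1}{2}\ln\left(\frac{\overline{\sigma_{\nu|i}^2}}{\sigma_\nu^2}\right)
=-\frac{1}{2}\ln\left(\frac{\overline{\sigma_p^2}}{\overline{\sigma_q^2}}\right)
=-\frac{1}{2}\ln(1-\mathrm{STR}) \tag{23}
$$

The inequality in (23) arises because the arithmetic mean is larger than or equal to the geometric mean or, more rigorously, follows from Jensen's inequality. Therefore, we have

$$
AC_p\le\sqrt{1-\exp(-2MI)}, \tag{24}
$$

$$
MSSS_p\le 1-\exp(-2MI). \tag{25}
$$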
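The effect of Jensen's inequality can be seen with a few numbers. In the sketch below, the function name `mi_and_snr_bound` and the forecast variances are hypothetical; two sets of forecast variances with the same mean (and therefore the same STR) are compared, one varying from case to case and one constant.

```python
import numpy as np

def mi_and_snr_bound(var_fcst, var_clim):
    """Return the Gaussian MI and the SNR-based lower bound -0.5*ln(1-STR)."""
    var_fcst = np.asarray(var_fcst, dtype=float)
    mi = 0.5 * (np.log(var_clim) - np.mean(np.log(var_fcst)))
    str_ = 1.0 - var_fcst.mean() / var_clim      # STR from the mean forecast variance
    bound = -0.5 * np.log(1.0 - str_)
    return mi, bound

# Same mean forecast variance (0.5), with and without case-to-case variation
mi_var, bound_var = mi_and_snr_bound([0.2, 0.5, 0.8], var_clim=1.0)
mi_const, bound_const = mi_and_snr_bound([0.5, 0.5, 0.5], var_clim=1.0)
print(mi_var, bound_var, mi_const, bound_const)
```

The SNR-based bound is the same in both cases, while MI exceeds it only when the forecast variance varies.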

The equalities in (23), (24) and (25) hold if and only if $\sigma_{\nu|i}^2$ is constant, as noted for (18) and (19). The conditions that the forecast and climatological distributions are both Gaussian and that the forecast variance $\sigma_{\nu|i}^2$ is constant are equivalent to the condition that $i$ and $\nu$ are jointly normally distributed [13,18]. If $i$ and $\nu$ are jointly normally distributed, the probability distributions $p(i)$, $p(\nu)$ and $p(\nu|i)$ are all Gaussian and the following relations hold [13,18-19]:

$$
\mu_{\nu|i}=\rho_0\frac{\sigma_\nu}{\sigma_i}\left(i-\bar{i}\right),\qquad
\sigma_{\nu|i}^2=(1-\rho_0^2)\,\sigma_\nu^2=\text{constant},\qquad
MI=-\frac{1}{2}\ln(1-\rho_0^2), \tag{26}
$$

where (26) is obtained using a linear regression, with $\rho_0$ being the linear correlation between the initial state $i$ and the future state $\nu$. As mentioned earlier, MI measures the statistical dependence between $i$ and $\nu$. As can be seen from (26), the statistical dependence reduces to a linear correlation $\rho_0$ if the two variables are jointly normally distributed. Because the conditional mean $\mu_{\nu|i}$ is a linear function of the initial state $i$ (see (26)), $\rho_0$ is also the linear correlation between $\mu_{\nu|i}$ and $\nu$, which is the potential anomaly correlation skill $AC_p$. Note that if $i$ and $\nu$ are not jointly normally distributed, $\rho_0$ is usually different from $AC_p$.
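The jointly normal case can be illustrated with simulated data. The sketch below assumes a bivariate normal pair with correlation 0.6 and unit variances; the sample size and variable names are choices made for this example only.

```python
import numpy as np

# Simulate a jointly normal initial state i and future state nu (assumed rho0 = 0.6)
rng = np.random.default_rng(1)
rho0, n = 0.6, 200_000
i = rng.normal(size=n)
nu = rho0 * i + np.sqrt(1.0 - rho0 ** 2) * rng.normal(size=n)

rho_hat = np.corrcoef(i, nu)[0, 1]

# Conditional mean from linear regression of nu on i
slope = rho_hat * nu.std() / i.std()
cond_mean = slope * (i - i.mean())

# Residual variance should approach (1 - rho0^2) * var(nu), independent of i
resid_var = np.var(nu - nu.mean() - cond_mean)

mi = -0.5 * np.log(1.0 - rho_hat ** 2)          # MI for jointly normal variables
ac_p = np.corrcoef(cond_mean, nu)[0, 1]         # correlation of conditional mean with nu
print(rho_hat, resid_var, mi, ac_p)
```

Because the conditional mean is a positive linear function of `i`, its correlation with `nu` coincides with `rho_hat`, as stated in the text for the jointly normal case.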

An interesting question arises here: how should we understand the MI-SNR discrepancy when the prediction variance varies significantly, as expressed in (24) and (25)? As discussed earlier, MI-based potential predictability measures the statistical dependence, linear or nonlinear, between the ensemble mean prediction $\mu_{\nu|i}$ and the hypothetical observation $\nu$ (an arbitrary ensemble member), whereas the SNR-based potential skill measures only their linear correlation. When $\mu_{\nu|i}$ and $\nu$ are jointly normally distributed, their statistical dependence reduces to linear correlation. When they are not jointly normally distributed, MI naturally disagrees with SNR. Jointly normally distributed variables have constant conditional variance. Note that the prediction variance is also the conditional variance of $\nu$ given the ensemble mean $\mu_{\nu|i}$. Thus, if the prediction variance varies, $\mu_{\nu|i}$ and $\nu$ are not jointly normally distributed, so the SNR-based potential skill, a linear correlation between $\mu_{\nu|i}$ and $\nu$, underestimates the nonlinear statistical dependence between $\mu_{\nu|i}$ (or $i$) and $\nu$, which is the strict statistical definition of potential predictability.

It should be noted that the above conclusion is not undermined by the possibility that SNR-based skill has a better relationship to actual skill than MI-based skill, simply because actual skill is often measured by linear correlation (or a related quantity), which is inherent to the SNR-based skill. Thus, a more challenging issue is how to design new metrics of actual forecast skill that can capture the MI-based predictive information beyond SNR. In principle, the MI between the ensemble mean prediction $\mu_{\nu|i}$ and the actual observation $O$ has the potential to quantify the MI-based potential predictability [15]. However, estimating MI effectively in this context is not an easy task.

In summary, there are connections between information-based potential predictability and SNR-based potential predictability, as built by the above equations. In other words, all the averaged information-based potential predictability measures are better than SNR-based predictability in characterizing ‘true’ potential predictability. When the climatology and prediction distribution are both Gaussian and the prediction variances are constant, the information-based measure is equivalent to the SNR-based potential measure.

## 3. Maximum SNR and PrCA

The signal and noise are theoretically statistically independent when the ensemble size is infinite. However, the ensemble size is always finite in practice, so the estimate of the signal is often contaminated by noise. An optimal estimate of the largest potential predictability is obtained by maximizing the SNR, from which the resultant signal component is the most predictable.

We denote by $S$ and $N$ the signal and noise of variable $X$, where $S$ and $N$ are matrices of a two-dimensional field describing the temporal and spatial variation of the signal and noise of the variable of interest; that is, this section works within the framework of multivariate statistics. Here

$$
S=\bar{X}_i-\bar{\bar{X}};\qquad N=X_{i,j}-\bar{X}_i,
$$

where $X_{i,j}$ is the $j$-th member of the ensemble prediction starting from the $i$-th initial condition. $K$ is the ensemble size and $M$ is the total number of initial conditions (predictions); $\bar{X}_i=\frac{1}{K}\sum_{j=1}^{K}X_{i,j}$ and $\bar{\bar{X}}=\frac{1}{M}\sum_{i=1}^{M}\bar{X}_i$.

Our goal is to look for a vector q, which can maximize the ratio of the variance of signal and noise that are projected onto the vector, namely,

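$$
r_S=q^TS;\qquad r_N=q^TN \tag{27}
$$

$$
\mathrm{SNR}=\frac{\sigma_{r_S}^2}{\sigma_{r_N}^2}\;\Longrightarrow\;\max \tag{28}
$$

where

$$
\sigma_{r_S}^2=\frac{1}{M}r_Sr_S^T=\frac{1}{M}q^TSS^Tq=q^T\Sigma_Sq,\qquad
\sigma_{r_N}^2=\frac{1}{KM}r_Nr_N^T=\frac{1}{KM}q^TNN^Tq=q^T\Sigma_Nq \tag{29}
$$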

Mathematically, the optimization in (29) leads to a generalized eigenvalue-eigenvector problem based on the Rayleigh quotient theorem [10,20]:

$$
q^T\Sigma_S=\lambda\,q^T\Sigma_N \tag{30}
$$

where $\Sigma_S$ is the covariance matrix of the signal and $\Sigma_N$ is the covariance matrix of the noise. The solution of (30) can be obtained by solving the following eigenvalue equation:

$$
q^T\Sigma_S\Sigma_N^{-1}=\lambda\,q^T \tag{31}
$$

Thus, the analysis of the largest potential predictability is also called the maximum signal-to-noise EOF (MSN EOF) analysis, first introduced by Allen and Smith [21] to estimate the signal optimally by suppressing the influences of noise, and widely used already in climate predictability study [20,22-23].

In practice, the number of grid points is always much larger than the total number of samples in climate studies; thus $\Sigma_N$ is usually not of full rank, which makes its inversion ill-conditioned. There are two common methods to address this issue, as introduced below.

### 3.1. The SNR is optimized in a truncated EOF space.

Denote by $e_i$ ($i=1,2,\ldots,k$) the EOF modes, collected as the columns of the matrix $e$ of dimension $m\times k$, where $m$ is the number of spatial grid points and $k$ is the number of retained modes. The EOFs can be computed from the signal matrix, the noise matrix or the corresponding data matrix. The signal and noise components projected onto them are

$$
T_S=e^TS,\qquad T_N=e^TN \tag{32}
$$

$T_S$ and $T_N$ are the principal components (PCs), with dimension $k\times n$, where $n$ is the number of time samples. Thus, the signal and noise covariance matrices used in (31) should be calculated as $\Sigma_S=T_ST_S^T$ and $\Sigma_N=T_NT_N^T$, respectively. If $k$ truncated modes are retained, where $k$ is usually much smaller than the number of spatial grid points, the signal and noise covariance matrices are full-rank $k\times k$ matrices. Thus, Eq. (31) can be easily solved, and the resulting vector $q$ (denoted $q_{eof}$), a $k$-element vector, is called the filter pattern in the truncated EOF space. The leading predictable component is

$$
r_S=q_{eof}^TT_S \tag{33}
$$

Projecting $q_{eof}$ back to data space, we have

$$
\mathrm{PrC}_{data}=q^TS=q_{eof}^Te^TS=q_{eof}^TT_S \tag{34}
$$

Eqs. (33) and (34) are identical to each other, i.e., the PrC is invariant with respect to a linear transformation.

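The vector $q$ is the filter pattern, rather than the most predictable pattern. The most predictable pattern $V$ can be obtained by regression, i.e.,

$$
N=V\,r_N,
$$

$$
V=\frac{N\,r_N^T}{r_N\,r_N^T}=\frac{1}{MK}N\,r_N^T=\frac{1}{MK}N\,T_N^T\,q=\frac{1}{MK}NN^T\,e\,q=\Sigma_N\,e\,q \tag{35}
$$

where $q$ here denotes the filter pattern $q_{eof}$ in the truncated EOF space. It can also be written as

$$
S=V\,r_S,
$$

$$
V=\frac{S\,r_S^T}{r_S\,r_S^T}=\frac{1}{M\lambda}S\,r_S^T=\frac{1}{M\lambda}SS^T\,e\,q=\frac{1}{\lambda}\Sigma_S\,e\,q=\Sigma_N\,e\,q \tag{36}
$$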
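The truncated-EOF procedure of this subsection can be sketched numerically as follows. Everything about the test data is an assumption for illustration: a synthetic ensemble with a single predictable spatial pattern plus white noise, EOFs taken from the total anomalies, and `k = 5` retained modes. The generalized eigenproblem is solved through a Cholesky factor of the noise covariance so that a symmetric eigen-solver can be used; this is a numerical convenience, not necessarily the procedure used by the authors.

```python
import numpy as np

rng = np.random.default_rng(0)
m, M, K, k = 40, 60, 8, 5     # grid points, initial conditions, members, retained EOFs

# Synthetic ensemble (assumed): one predictable pattern plus white noise
pattern = np.sin(np.linspace(0.0, np.pi, m))
amp = rng.normal(size=M)
X = amp[:, None, None] * pattern[None, None, :] + rng.normal(size=(M, K, m))

ens_mean = X.mean(axis=1)                                   # (M, m)
S = (ens_mean - ens_mean.mean(axis=0)).T                    # signal matrix, m x M
N = (X - ens_mean[:, None, :]).reshape(M * K, m).T          # noise matrix, m x MK
T = (X - ens_mean.mean(axis=0)).reshape(M * K, m).T         # total anomalies, m x MK

# EOFs of the total anomalies, truncated to k modes
U, _, _ = np.linalg.svd(T, full_matrices=False)
e = U[:, :k]                                                # m x k

T_S, T_N = e.T @ S, e.T @ N                                 # PCs of signal and noise
Sigma_S = T_S @ T_S.T / M
Sigma_N = T_N @ T_N.T / (M * K)

# Solve Sigma_S q = lambda Sigma_N q via Cholesky whitening of Sigma_N
L = np.linalg.cholesky(Sigma_N)
Linv = np.linalg.inv(L)
w, Y = np.linalg.eigh(Linv @ Sigma_S @ Linv.T)              # ascending eigenvalues
lam = w[::-1]                                               # SNR of each mode, descending
q_eof = Linv.T @ Y[:, -1]                                   # leading filter pattern (unit noise variance)

r_S = q_eof @ T_S                                           # leading predictable component
r_N = q_eof @ T_N
q = e @ q_eof                                               # filter pattern in grid space
V = N @ r_N / (M * K)                                       # most predictable pattern by regression on noise
print("leading SNR:", lam[0])
```

In this construction the variance of the leading predictable component equals the leading eigenvalue, because the filter is normalized to unit noise variance.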

### 3.2. Solving eq (29) using whitening approach

The approach is to whiten the noise covariance (i.e., the denominator), transforming $\Sigma_N$ into an identity matrix while applying the same transformation to the signal covariance matrix $\Sigma_S$. Thus, Eq. (29) becomes

$$
\mathrm{SNR}=\frac{q^T\Sigma_Sq}{q^T\Sigma_Nq}=\frac{q'^T\Sigma_{WS}\,q'}{q'^Tq'}\;\Longrightarrow\;\text{maximum} \tag{37}
$$

From matrix theory, the SNR in Eq. (37) reaches its maximum when $q'$ is the leading eigenvector of $\Sigma_{WS}$, the signal covariance whitened by the noise covariance $\Sigma_N$. The vector $q'$ is $q$ modified by a whitening factor. The algorithm is briefly summarized as follows:

1. Transform the covariance matrix of noise ($\Sigma_N$) into an identity matrix, namely,

$$
D^{-1/2}E^T\Sigma_NED^{-1/2}=I \tag{38}
$$

where $D$ and $E$ are the eigenvalue and eigenvector matrices of $\Sigma_N$, and $ED^{-1/2}$ is the transformation matrix that turns the covariance matrix of noise ($\Sigma_N$) into an identity matrix.

2. Whiten the signal covariance matrix with the transformation matrix $ED^{-1/2}$, using the $k$ leading modes:

$$
\Sigma_{WS}=D^{-1/2}E^T\Sigma_SED^{-1/2} \tag{39}
$$

3. The SNR of (37) reaches its maximum when $q'$ is the leading eigenvector of the whitened signal covariance matrix $\Sigma_{WS}$ (with eigenvalues sorted in descending order). The relationship between $q$ and $q'$ follows from

$$
q'^T\Sigma_{WS}\,q'=q'^TD^{-1/2}E^T\Sigma_SED^{-1/2}q'=q^T\Sigma_Sq
$$

Thus,

$$
q=ED^{-1/2}q' \tag{40}
$$

4. Once the filter pattern $q$ is known, the most predictable component is easily derived as in method 1, namely by projecting the signal (ensemble mean) onto the filter patterns:

$$
\mathrm{PrCs}_s=q^TS,\qquad \mathrm{PrCs}_n=q^TN,\qquad \mathrm{PrCs}_t=q^TT \tag{41}
$$

The most predictable component is the one corresponding to the largest signal-to-noise ratio. All PrCs are temporally orthogonal (uncorrelated) with each other. Note that (41) differs slightly from (33) or (34), where the PCs of the truncated EOF space are used. This is due to the different truncation procedures in the two methods. In the first method, the truncation is applied before the optimization, whereas in the second method the truncation is implicitly integrated into the whitening process. However, the two should be equivalent, as can be seen from another expression of (41):

$$
\mathrm{PrCs}_s=q^TS=q'^TD^{-1/2}E^TS=q'^TD^{-1/2}T_S
$$

5. Obtain the corresponding predictable patterns $V$ by

$$
V=\frac{N\,\mathrm{PrCs}_n^T}{\mathrm{PrCs}_n\,\mathrm{PrCs}_n^T}=\frac{1}{MK}N\,\mathrm{PrCs}_n^T=\frac{1}{MK}NN^Tq=\Sigma_Nq
$$

A reconstructed forecast based on the leading PrCA modes can be obtained by

$$
\hat{N}=V\,\mathrm{PrCs}_n,\qquad \hat{S}=V\,\mathrm{PrCs}_s,\qquad \hat{X}=V\,\mathrm{PrCs}_t \tag{42}
$$

$\hat{X}$ retains only the leading PrCA modes and removes the noise components; thus it can be expected to have better skill than the simple ensemble mean.
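The whitening steps above can be sketched as follows. The synthetic ensemble is the same kind of illustrative data as before (one predictable pattern plus white noise), and all variable names are assumptions. Because this example has more noise samples than grid points, all `m` noise modes are retained (`k = m`); with real fields, where the noise covariance is rank-deficient, a smaller `k` would be needed.

```python
import numpy as np

rng = np.random.default_rng(0)
m, M, K = 40, 60, 8           # grid points, initial conditions, ensemble members
k = m                         # retained noise modes (full rank in this synthetic case)

# Synthetic ensemble (assumed): one predictable pattern plus white noise
pattern = np.sin(np.linspace(0.0, np.pi, m))
amp = rng.normal(size=M)
X = amp[:, None, None] * pattern[None, None, :] + rng.normal(size=(M, K, m))

ens_mean = X.mean(axis=1)
S = (ens_mean - ens_mean.mean(axis=0)).T                    # signal matrix, m x M
N = (X - ens_mean[:, None, :]).reshape(M * K, m).T          # noise matrix, m x MK
Sigma_S = S @ S.T / M
Sigma_N = N @ N.T / (M * K)

# Step 1: eigen-decomposition of the noise covariance and whitening matrix E D^{-1/2}
d, E = np.linalg.eigh(Sigma_N)
idx = np.argsort(d)[::-1][:k]                               # k leading noise modes
D, E = d[idx], E[:, idx]
W = E / np.sqrt(D)                                          # scales column j by D[j]^{-1/2}

# Step 2: whitened signal covariance
Sigma_WS = W.T @ Sigma_S @ W

# Step 3: leading eigenvector q' and filter pattern q = E D^{-1/2} q'
w, Qp = np.linalg.eigh(Sigma_WS)                            # ascending eigenvalues
lam, q_prime = w[-1], Qp[:, -1]
q = W @ q_prime

# Step 4: predictable components of signal and noise
PrC_s, PrC_n = q @ S, q @ N

# Step 5: most predictable pattern by regression on the noise component
V = N @ PrC_n / (M * K)
print("maximum SNR:", lam)
```

The leading eigenvalue of the whitened signal covariance is the maximized SNR, and the regression pattern `V` coincides with the product of the noise covariance and the filter pattern.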

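The variance explained by a PrCA mode can be obtained using (42). If all modes are retained in (42), the reconstructed field should explain 100% of the original field. Applying (42) to the signal and noise gives

$$
\Sigma_S=\frac{1}{M}SS^T;\qquad \hat{\Sigma}_S=\frac{1}{M}V\,r_S\,r_S^T\,V^T=V\Lambda V^T=\sum_{j}\lambda_j\,v_jv_j^T
$$

$$
\Sigma_N=\frac{1}{MK}NN^T;\qquad \hat{\Sigma}_N=\frac{1}{MK}V\,r_N\,r_N^T\,V^T=VV^T=\sum_{j}v_jv_j^T
$$

where $\hat{\Sigma}$ is the covariance estimated using the PrCA modes. Thus, the variance explained by a specific mode $j$, measured relative to the original space and to the truncated space, respectively, is as follows:

$$
\text{relative to signal:}\quad \frac{\lambda_j\,v_j^Tv_j}{\mathrm{tr}(\Sigma_S)};\qquad \frac{\lambda_j\,v_j^Tv_j}{\mathrm{tr}(\hat{\Sigma}_S)}
$$

$$
\text{relative to noise:}\quad \frac{v_j^Tv_j}{\mathrm{tr}(\Sigma_N)};\qquad \frac{v_j^Tv_j}{\mathrm{tr}(\hat{\Sigma}_N)}
$$

$$
\text{relative to total variance:}\quad \frac{\lambda_j\,v_j^Tv_j+v_j^Tv_j}{\mathrm{tr}(\Sigma_S)+\mathrm{tr}(\Sigma_N)};\qquad \frac{\lambda_j\,v_j^Tv_j+v_j^Tv_j}{\mathrm{tr}(\hat{\Sigma}_S)+\mathrm{tr}(\hat{\Sigma}_N)}
$$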

## 4. Maximizing PI and PrCA

Another interpretation of MSN EOF is its connection with the information-based measures PI or PP defined in (11)-(15). For example, as argued in [10], the predictive power PP is a positively oriented predictive index, defined through the difference between the posterior (prediction) entropy and the prior (climatology) entropy, and thus measures the decrease in uncertainty due to the prediction.

PrCA is an approach that maximizes PI, or equivalently PP, to derive the most predictable component; this is equivalent to minimizing $\sigma_p^2/\sigma_q^2$ if the prediction variance changes little. Here $\sigma_q^2$ is the climatological variance, often referred to as the total variance $\mathrm{Var}(T)$, which is composed of the signal variance and the noise variance. Under the 'perfect model' assumption, the noise variance equals the forecast error variance [24], namely,

$$
\sigma_q^2=\mathrm{Var}(S)+\mathrm{Var}(N)=\mathrm{Var}(S)+\sigma_p^2 \tag{43}
$$

Thus, the minimization of $\sigma_p^2/\sigma_q^2$ is equal to the maximization of $1-\sigma_p^2/\sigma_q^2$, i.e., STR, which is equivalent to the maximization of SNR, i.e., MSN EOF. In some literature, the terms MSN EOF and PrCA are used interchangeably because of their complete equivalence. In fact, both the MSN EOF and PrCA methods belong to discriminant analysis, because the two methods, though from different perspectives, can be understood as seeking the best linear combination of variables that separates the signal and the noise as much as possible [13]. Both methods identify the "filter pattern", or weight matrix, which provides an optimized filter to discriminate the signal from the noise. The resulting time series reflects the temporal evolution of the dominant mode of the signal, and the spatial pattern characterizes its spatial distribution; these are referred to, respectively, as the most predictable component and the most predictable pattern.

It should be noted that the equivalence of the SNR-based and information-based PrCA approaches rests on the condition that the climatological and forecast distributions are both Gaussian. This is apparent since, under non-Gaussian assumptions, PI and PP cannot be expressed solely in terms of the prediction and climatological variances as in (11)-(15). It is difficult to derive the optimization solution for PI or PP from their general definitions (8) and (9).

A remark on the algorithm of Schneider and Griffies [10] concerns a technical issue. In Schneider and Griffies [10], PrCA is derived by maximizing PP, i.e., minimizing $\sigma_p^2/\sigma_q^2$, leading to the following eigenvalue equation:

$$
f^T\Sigma_N\Sigma_T^{-1}=\xi\,f^T \tag{44}
$$

where $\Sigma_T$ is the total covariance matrix. The optimal filter yielding the most predictable component is the eigenvector $f$ with the smallest eigenvalue $\xi$. Comparing (44) with (31) reveals that the eigenvalues $\xi$ and $\lambda$ are inversely related, indicating the equivalence of PrCA via the maximization in (31) and the minimization in (44). In practice, the eigenvector with the smallest eigenvalue often lacks a stable, large-scale pattern, making the approach of (44) impractical in real applications. The truncated EOF space, which is used in solving (31) and (44), can greatly reduce this concern, but the most predictable pattern still contains some noise. Thus, the MSN EOF approach introduced above is a better option.
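The relation between the two eigenvalues can be made explicit with a short derivation, sketched here under the assumption (consistent with the variance decomposition used earlier) that the total covariance is the sum of the signal and noise covariances, $\Sigma_T=\Sigma_S+\Sigma_N$, and that all three matrices are symmetric. Right-multiplying (44) by $\Sigma_T$ and transposing gives

$$
\Sigma_Nf=\xi\,(\Sigma_S+\Sigma_N)\,f
\quad\Longleftrightarrow\quad
\Sigma_Sf=\frac{1-\xi}{\xi}\,\Sigma_Nf .
$$

Comparing this with the transposed form of the SNR eigenproblem, $\Sigma_Sq=\lambda\,\Sigma_Nq$, suggests that $f$ and $q$ coincide up to scaling, with

$$
\lambda=\frac{1-\xi}{\xi},\qquad \xi=\frac{1}{1+\lambda},
$$

so that, under these assumptions, the smallest $\xi$ corresponds to the largest $\lambda$.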

## 5. A practical application – Potential predictability of climate change projection in AR5

In this section, we explore the uncertainty of climate change projection using the above theoretical framework. The estimation of uncertainty is based on the Coupled Model Intercomparison Project Phase 5 (CMIP5), a new set of climate model experiments contributing to the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5). CMIP5 is designed to address some crucial issues in climate modeling and future climate states. More than 20 climate models were employed in this project, with a main focus on: 1) evaluating model predictability of future climate on different time scales (near term, out to about 2035, and long term, out to 2100 and beyond); 2) understanding key mechanisms responsible for differences in model projections; and 3) quantifying some important feedbacks of the climate system, such as clouds and the carbon cycle.

One set of experiments in CMIP5 is based on the Representative Concentration Pathways (RCPs) scenarios. All model experiments in this set are forced by four mixed greenhouse gas (GHG) boundary conditions, which lead to increases in radiative forcing of 2.6, 4.5, 6.0 and 8.5 W m⁻² by the end of the 21st century.

In this chapter, we use the sea surface temperature (SST) projection of scenario R60 (the 6.0 W m⁻² experiment) to evaluate the potential predictability of the climate projection. At present, only nine models in R60 are available for download (from the ESG-PCMDI Gateway), as summarized in Table 1.

| Model | Country | Ocean model resolution | Projection |
|---|---|---|---|
| CCSM4-version16 | USA (NCAR) | 60 levels; 1.0° lon. x 0.5° lat. | 2051-2100 |
| CSIRO-MK3.6.0 | Australia (BMRC) | 30 levels; 1.875° lon. x 0.9375° lat. | 2051-2100 |
| GISS-E2-R | USA (NASA) | 32 levels; 1° lon. x 1.25° lat. | 2051-2100 |
| GFDL-ESM2M | USA (GFDL) | 50 levels; 1-1/3° lon. x 1° lat. | 2051-2100 |
| HadGEM2-ES | UK (Hadley Center) | 40 levels; 1-1/3° lon. x 1° lat. | 2051-2100 |
| CM5A-LR | France (IPSL) | ORACA2 resolution in OPA | 2051-2100 |
| MIROC5-Coco 4.5 | Japan | Varied resolution | 2051-2100 |
| MRI-CGCM3 | Japan | Varied resolution | 2051-2100 |
| NorESM1-M | Norway | Varied resolution | 2051-2100 |

### Table 1.

Models used for evaluation

The SST outputs from these models are all monthly averaged data. For the purpose of studying climate change, we use annual means in the following discussion. Because the number of ensemble members is not uniform across models, only one member is used for each model here. In this study, we confine the domain to the Pacific from 60°S to 60°N.

Shown in Fig. 1 and Fig. 2a are the spatial pattern and time series of the first EOF (Empirical Orthogonal Function) of Pacific SST for 2051-2100. As can be seen in Fig. 2a, the Pacific SST shows a striking increase, with the strongest response to GHG forcing in the tropical Pacific along the equator, as shown in Fig. 1. In the extratropics poleward of 30°S and 30°N, the increase in SST is relatively weak. On average, the mean temperature of the Pacific Ocean between 60°S and 60°N increases by around 0.5 to 1°C from 2051 to 2100 in these models, as shown in Fig. 2b, which displays the evolution of the mean temperature over the Pacific Ocean. The multi-model mean shows an increase of around 0.75°C, as shown by the red line in Fig. 2b.

Fig. 2 shows a visible divergence among the model projections, suggesting uncertainties in the responses of these models to GHG forcing. It should be noted that the small divergence in Fig. 2a is due to normalization, a post-processing step applied only for the presentation of the figure.

It is of great interest to explore the uncertainty of the above projections. As introduced aforementioned, one can use the above information-based framework to measure the uncertainty of climate prediction, given the multiple ensembles available. Apparently, there are several challenges here: 1) there is only one-member projection for each model, lacking sufficient ensembles; 2) the projection is not dependent on initial condition, thus any measures based on multiple initial conditions are invalid here; 3) the climatological distribution used in estimating the uncertainty may be uncertain under the background of global warming. For the first issue, we propose to solve it using multiple model strategy, i.e. pool all model projections to construct a 9-member ensemble. Under the framework of potential predictability, the model is assumed to be perfect. Thus the disparities among these model projections can be viewed as the ensembles of a perfect model, perturbed by initial conditions or other parameters. For the second issue, we assume that the projection is a long-term prediction at a given initial condition. The distribution for the average of multiple model projections is used as climatological distribution here.

Fig. 3 displays the variation of the projection utility RE over the projection period from 2051 to 2100. The climatological mean and variance are estimated from all ‘ensemble’ members and years (sample size 50 × 9), as in [25]. The projection mean and variance are estimated each year from the 9-member ensemble. The utility RE decreases until around 2070 and rebounds after 2080; for projections during 2070-2080, RE is small. When the projection (prediction) and climatological distributions are identical, the relative entropy RE is zero by (14). In theory, a nonzero value of RE indicates predictability. In practice, however, a finite sample size introduces sampling errors that produce a nonzero RE even when the prediction supplies no extra information. RE is therefore considered statistically significant only if it exceeds the level attributable to the finite sample size. We quantify this level with a Monte Carlo method, as in [26]. A sample of 9 members is randomly drawn from the climatological distribution, and its relative entropy RE is computed with respect to that distribution. This process is repeated 10,000 times, and the 95th percentile of the 10,000 RE values is taken as the significance level, shown as the dashed line in Fig. 3. The projections between around 2070 and 2080 have statistically ‘zero’ relative entropy, whereas the projections outside this period have significant relative entropy.
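The procedure above can be sketched in a few lines. This is a minimal illustration, assuming a univariate Gaussian form of RE (the standard Kullback-Leibler divergence between two normal distributions, whose notation may differ from that of (14)); the function names and the synthetic 50-year, 9-member array are hypothetical stand-ins, not the chapter's data or code.

```python
import numpy as np

def gaussian_relative_entropy(mu_p, var_p, mu_q, var_q):
    """Univariate Gaussian RE (in nats) of prediction p relative to climatology q."""
    return 0.5 * (np.log(var_q / var_p) + var_p / var_q - 1.0
                  + (mu_p - mu_q) ** 2 / var_q)

def significance_threshold(mu_q, var_q, n_members=9, n_trials=10_000,
                           level=95.0, seed=0):
    """Monte Carlo estimate of the RE level produced by sampling error alone."""
    rng = np.random.default_rng(seed)
    # Draw n_trials ensembles of n_members from the climatology itself.
    samples = rng.normal(mu_q, np.sqrt(var_q), size=(n_trials, n_members))
    mu_s = samples.mean(axis=1)
    var_s = samples.var(axis=1, ddof=1)
    re_null = gaussian_relative_entropy(mu_s, var_s, mu_q, var_q)
    return np.percentile(re_null, level)

# Synthetic stand-in (an assumption): 50 years x 9 'members' with a warming
# trend of 0.75 over the period and a constant inter-model spread.
rng = np.random.default_rng(1)
n_years, n_members = 50, 9
trend = np.linspace(0.0, 0.75, n_years)
projections = trend[:, None] + 0.15 * rng.standard_normal((n_years, n_members))

# Climatology pooled over all members and years (sample size 50 * 9).
mu_q = projections.mean()
var_q = projections.var(ddof=1)

# Projection mean and variance estimated each year from the 9 members.
mu_p = projections.mean(axis=1)
var_p = projections.var(axis=1, ddof=1)

re_yearly = gaussian_relative_entropy(mu_p, var_p, mu_q, var_q)
threshold = significance_threshold(mu_q, var_q, n_members=n_members)
significant = re_yearly > threshold
print(f"95% threshold: {threshold:.3f}; "
      f"significant years: {significant.sum()} of {n_years}")
```

Because the Gaussian RE is invariant to a shift and rescaling applied to both distributions, the threshold in this sketch depends only on the ensemble size and the number of trials, not on the particular climatological mean and variance.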

A striking feature of RE in Fig. 3 is its U-shaped variation with projection time (i.e., the time step of integration), which differs markedly from actual ensemble forecasts at time scales from days to seasons. Typically, RE decreases monotonically with lead time at those time scales (e.g., [1-2, 11]), i.e., predictability decreases with lead time. This monotonic decrease reflects the nature of the real atmospheric and oceanic system, which is chaotic and stochastic, so that the information in the initial conditions gradually dissipates with lead time. This is evidently not the case here, since the projection is not an initial value problem but mainly a response to external forcing (e.g., CO2).

One possible explanation for this U-shape is related to the climatological distribution used here. We used the average of multiple model projections, which has an apparent trend, as the climatological distribution. If RE is dominated by the ensemble mean (ensemble mean square) and the contribution of the ensemble spread is much smaller, RE can show such a U-shaped structure. Another plausible explanation rests on the hypothesis that the climatology from multiple models is close to the true value. Under this assumption, a projection with small RE in Fig. 3 has high fidelity, and vice versa. Here, RE measures the difference between the projection distribution and the designed distribution, a usage that has also appeared in previous studies [17]. However, such a hypothesis may raise concerns. One may argue for using the present climatological distribution as the reference distribution in the above discussion, but the climatology under scenario R60 can be expected to differ considerably from the present one. Thus, further study of the reference distribution is much needed for estimating the uncertainty of climate change projection.
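The first explanation can be made concrete with a short worked example. The following assumes the univariate Gaussian form of RE, with prediction mean and variance $\mu_p, \sigma_p^2$ and climatological mean and variance $\mu_q, \sigma_q^2$; these symbols, and the idealized linear trend with intercept $a$ and slope $b$, are assumptions introduced for illustration and may not match the notation of (14).

```latex
\mathrm{RE}
  = \underbrace{\frac{1}{2}\left[\ln\frac{\sigma_q^2}{\sigma_p^2}
      + \frac{\sigma_p^2}{\sigma_q^2} - 1\right]}_{\text{dispersion (spread) term}}
  + \underbrace{\frac{(\mu_p - \mu_q)^2}{2\,\sigma_q^2}}_{\text{signal (mean) term}}

% Idealized linear trend over equally spaced years t in [t_1, t_2],
% with the climatology pooled over the whole period:
\mu_p(t) = a + b\,t, \qquad
\mu_q = a + b\,\bar{t}, \qquad
\bar{t} = \tfrac{1}{2}(t_1 + t_2)

\Rightarrow\quad
\frac{(\mu_p(t) - \mu_q)^2}{2\,\sigma_q^2}
  = \frac{b^2\,(t - \bar{t})^2}{2\,\sigma_q^2}
```

Under these assumptions the signal term is a parabola in $t$ with its minimum at the middle of the period. For 2051-2100 the midpoint is $\bar{t} = 2075.5$, which falls inside the 2070-2080 window where RE is small in Fig. 3. If the spread term varies little from year to year, RE inherits this U-shape.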

## 6. Conclusion

In this chapter, the SNR-based and information-based measures of potential predictability were introduced. They include the signal-to-noise ratio (SNR) and two information-based measures of predictability. One is the relative entropy (RE), which measures individual potential predictability; the other is the mutual information (MI), the average of RE over all initial conditions, which measures the average potential predictability. From the statistical derivation and theoretical analysis, we draw the following conclusions:

1. The SNR usually measures the average predictability under the assumption that the signal associated with slowly varying external forcing is predictable and the noise is unpredictable.

2. A new measure of prediction utility that is derived from information theory is introduced. It measures the additional information provided by a prediction (p) over that already available from the climatological or reference distribution (q). One natural measure is their relative entropy RE, defined as the relative difference of entropy between p and q. For the case of Gaussian distributed p and q, the RE can be expressed in terms of the prediction and reference means and covariances.

3. The average of RE over all initial conditions, called the mutual information (MI), is a measure of the statistical dependence between the forecast state and the initial (boundary) conditions, and it measures the average predictability. The MI-based metrics can measure more potential prediction utility than their SNR-based counterpart. The MI-based predictability measures the statistical dependence, linear or nonlinear, between the ensemble mean (prediction) and an ensemble member (hypothetical observation), whereas the SNR-based predictability measures only a linear relationship between the prediction and the hypothetical observation.

4. When the prediction and climatological distributions are Gaussian and the ensemble spread is constant across predictions, the two measures are identical. When the ensemble spread is not constant, the SNR-based predictability often underestimates the potential predictability.

5. The predictable component analysis (PrCA), a method that assesses the most predictable patterns, is introduced. The PrCA decomposes the predictability into patterns accounting for different fractions of the total predictability. Distinguishing spatial structures that are unpredictable from those that are predictable is important for practical prediction problems, particularly when the predictable patterns are few in number.

As an example, the uncertainty of the climate change projection from scenario R60 of AR5 was evaluated, with the Pacific SST as the target. Nine models from different countries were used in this evaluation. The most striking warming occurs in the tropical Pacific along the equator. In the extra-tropics poleward of 30°S and 30°N, the increase in SST is relatively weaker. Averaged over the Pacific Ocean from 60°S to 60°N, the mean temperature increases by around 0.5 to 1°C over 2051-2100 in these models. The relative entropy RE, which measures the utility of the climate projection, decreases until around 2070 and rebounds after 2080; for projections during 2070-2080, RE is small. Under the assumption that the climatology from multiple models is close to the true value, a small RE during this period suggests high fidelity of the projection, and vice versa.

### Acknowledgement

This work is supported by grants from Canadian NSERC Discovery Program and Chinese NSF 41276029 D0601. This work is also supported by grants from the National Basic Research Program (2013CB430300), the National Science Foundation (91128204) and the State Oceanic Administration (201105018, 151053) of China.

## Notes

• ei is a matrix of m*k, where m is the number of spatial grids and k is the number of the truncated modes. EOF could be employed using signal, or noise matrix or corresponding data matrix.


© 2013 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## How to cite and reference

### Cite this chapter

Youmin Tang, Dake Chen, Dejian Yang and Tao Lian (January 16th 2013). Methods of Estimating Uncertainty of Climate Prediction and Climate Change Projection, Climate Change - Realities, Impacts Over Ice Cap, Sea Level and Risks, Bharat Raj Singh, IntechOpen, DOI: 10.5772/54810. Available from:
