Identification of Intraseasonal Modes of Variability Using Rotated Principal Components

Documenting mid-tropospheric global-scale circulation is important to climate modelers. Models are applied to capture the most basic statistics of the flow (e.g., annual and season means and variability). As model physics improve, the goal is to extend the accuracy of the models to shorter, more societally relevant time scales (e.g., monthly average flow). Within a month or season, recurrent modes are present. Comparing such observed flows to modeled counterparts can provide an arbiter of success. There is a long history is isolating such patterns in meteorology using teleconnections (correlation patterns). Some of the initial work by Namias (1980) was used to create atlases of teleconnections at every gridpoint. This approach has the merit of completeness; however, the redundancy in patterns from adjacent gridpoints is inefficient, given present day computational power. An extension of the Namias approach to selected base points provided extensive documentation of Northern Hemisphere winter teleconnection patterns of sea level pressure and 500 mb height (Wallace & Gutzler, 1981).


Introduction
Documenting mid-tropospheric global-scale circulation is important to climate modelers. Models are applied to capture the most basic statistics of the flow (e.g., annual and season means and variability). As model physics improve, the goal is to extend the accuracy of the models to shorter, more societally relevant time scales (e.g., monthly average flow). Within a month or season, recurrent modes are present. Comparing such observed flows to modeled counterparts can provide an arbiter of success. There is a long history is isolating such patterns in meteorology using teleconnections (correlation patterns). Some of the initial work by Namias (1980) was used to create atlases of teleconnections at every gridpoint. This approach has the merit of completeness; however, the redundancy in patterns from adjacent gridpoints is inefficient, given present day computational power. An extension of the Namias approach to selected base points provided extensive documentation of Northern Hemisphere winter teleconnection patterns of sea level pressure and 500 mb height (Wallace & Gutzler, 1981).
A second approach to identifying these variability modes uses eigenvectors to filter the correlation structures into recurrent patterns that are localized. To document the characteristic flow patterns and their time dependent statistics, an objective methodology is applied to portray the localization of the centers of action in each mode. Often, rotated principal component analysis (RPCA) is employed to decompose the aforementioned correlation structure of the flow to obtain information on the morphology of the flow patterns, their associated time behavior and the variability of the total flow associated with each pattern. The eigenvectors are scaled to create principal components that are linearly transformed or rotated to exploit the local structure within the domain, identifying physically meaningful circulation patterns. Barnston & Livezey (1987; hereafter referred to as BL87) present an extensive catalogue of mid-tropospheric patterns using this approach. Their work is somewhat limited by use of a specific rotation (Varimax) that enforces an orthogonal transformation matrix from the principal components to their rotated counterparts. Additionally, they did not test rigorously the number of eigenmodes, selecting ten modes for each analysis. Selecting too few eigenmodes can result in multiple patterns being merged on each eigenvector retained. Moreover, if too many patterns are selected, the circulation modes can be fragmented.
In this work, we extend the eigenvector-based approach by relaxing the orthogonality constraint and test each analysis for the optimal number of eigenvectors to retain and rotate. While BL87 used cutting-edge analyses for the 1980's, the limited availability of data (35 years) and computational power (limiting the number of gridpoints to 358) are suboptimal by current standards. Innovation in the analysis procedure in recent years combined with newer data sets, the availability of 63 years of data and much denser grids to document the climate system are strong motivators to re-examine Northern Hemisphere geopotential modes of variability.

Data and methodology 2.1 Description of the dataset
To conduct a study on hemispheric teleconnectivity, it is critical to have high-resolution, global gridded data with a long period of record. The NCEP/NCAR reanalysis (NNRP -Kalnay et al., 1996) dataset, which is formulated on a 2.5° x 2.5° latitude-longitude grid, includes over 200 meteorological variables at up to 17 vertical levels at 6 hour intervals from 1948 -present. Variables in the NNRP were rated by Kalnay et al. (1996) based on the influence of observations, model derived data, and climatology in their formulation. As this study is concerned with mid-tropospheric intraseasonal modes of variability, 500 hPa Northern Hemispheric geopotential heights from the NNRP (rated most reliable by Kalnay et al., 1996) were used in the analysis.
To maintain consistency with the methodology of BL87, monthly averages of 500 hPa heights were formulated from 63 years of the NNRP . The domain of interest used by BL87 did not include height data south of 20°N latitude, primarily owing to the lack of variability in the tropics in mid-level height data. To maintain consistency, the same 20°N threshold for the southern-most latitude line was used in the present study. Fig. 1a shows the study domain and the grid spacing associated with the NNRP.
The NNRP are provided on a latitude-longitude grid; hence, gridpoints will converge as the domain extends poleward. This convergence causes artificial inflation of correlation values at higher latitudes. To mitigate this issue, the data were analysed objectively using a Barnes analysis (Barnes, 1964) to place the data on a Fibonacci grid (Swinbank & Purser, 2006) that, by definition, has equal grid spacing. Such a grid will not result in artificial inflation of the correlations between the 2303 gridpoints (Fig. 1b). To ensure no data extrapolation at lower latitudes, the spacing associated with the Fibonacci grid was kept identical to that at the equator. Interpolation error of the Barnes analysis was calculated at less than 1%. The Fibonacci grid defines the domain and is used in the computation of the correlation matrix.

RPCA methodology
Variation of geopotential height is a function of latitude. To avoid biasing the analyses and to permit smaller, but equally important, variation in the southern regions of the domain to be represented equally, a correlation matrix (R), rather than variance-covariance matrix, was computed. Using the correlation matrix standardizes the data to a mean of zero and standard deviation of one. Hence, all the subsequent analyses are standardized anomalies (Z) from the mean. The correlation matrix was formed among the grid points by summing over the 63 year sample for January or July, providing a representative month in the cold or warm season, respectively. The correlation matrix for January (July) was decomposed into a square matrix of eigenvectors (V) and associated diagonal matrix of eigenvalues (D), given by the equation The rank of the eigenvector matrix is equal to the smaller of the number of gridpoints or number of observations minus 1. Because there were 63 observations, only 62 eigenvalues were nonzero and 62 eigenvectors were extracted. The goal of this stage of the analysis is to create a set of basis vectors that compress the original variability in R into a new reference frame. It is possible to plot the elements of each vector (V) on spatial maps; however, the patterns in V do not result in any localization of the spatial variance, nor do they represent well the variability in R (Richman, 1986). The eigenvectors were scaled by the square root of the corresponding eigenvalue to create principal component loadings (A). Doing so permits the data to be expressed as where the vectors in F represent the new set of basis functions, known as principal component scores and A is the matrix of weights that relates the original standardized data (Z) to F. The vectors in A contain elements that are regression coefficients between Z and F.
Many of the 62 dimensions represent small-scale signals (sub-planetary scale) that have variance properties indistinguishable from noise, associated with very small eigenvalues. We truncate the number of principal components to represent only that variance associated with planetary scale wavetrains. To accomplish this goal, a two-step process is applied. First, the magnitudes of the eigenvalues are examined and those with relatively large eigenvalues are retained to yield a subset of l principal component loading vectors. The value l is selected by implementing the scree test (Wilks, 2011) to provide a visual estimate of the approximate number of non-degenerate eigenvectors to retain. At this stage, a number of roots, l, is selected to be liberal, intentionally representing more than the ideal number of roots, k, associated with the large-scale signal. To assess the coherent signal, the vectors of A are linearly transformed to a new set of vectors, B, known as rotated principal component loadings, which simplify or localize the hemispheric wavetrains to agree with the correlation structure of the data. This process can be summarized by the equation where T is an invertible transformation matrix that represents a rotation of the reference frame into a position that results in the maximal simplification that the data permit. The rotation algorithm used in this analysis is Promax (Richman, 1986). Promax PC scores allow for correlations between the vectors in F. The goal of the rotation is to identify, in the vectors of B, height anomaly patterns that recur often during the month of January (July). The spatial properties of each vector in B are a function of the number of rotated PC vectors retained (l). Therefore, it is critical to select the optimal number of vectors, k. To accomplish that goal, the matrix A is transformed to B for a variable number of PC vectors retains (i.e., 2 to l). Each solution yields a different set of patterns. We desire a set k < l that capture as much coherent large scale signal as possible that matches the patterns embedded within R.
The one set of k PC loadings that relates best to the correlation matrix generating them is determined and the number of PCs retained is set to k. The process, outlined in Richman & Lamb (1985), and refined in Richman (1986), selects each vector of B and identifies the location or gridpoint with the highest absolute PC loading and matches the rotated PC loading vector to the corresponding correlation vector using the congruence coefficent (Richman, 1986). The corresponding correlation vector in R is a teleconnection pattern. Hence, this method incorporates the logic of the traditional teleconnection approach, providing an objective procedure to selecting k. For January, the Promax solution with the optimum match occurred as k = 8 and for July, k = 4. The average absolute congruence match was found to be 0.904 Promax in January and 0.895 for Promax in July.
Owing to the linear decomposition of R, signs of the coefficients, in each vector of B can be multiplied by -1 with no loss of interpretation. Because positive values exceeding ~ +0.25 correspond to ridges (Richman & Gong, 1999), negative values of ~ -0.25 correspond to troughs, multiplication by -1 reverses the interpretation of the troughs and ridges. Furthermore, the sign of the PC score is multiplied by the sign of the PC loading to obtain a physical interpretation of any monthly map (Compagnucci & Richman, 2008). For example, a negative anomaly in a PC loading map multiplied by a negative PC score gives the interpretation of a positive height anomaly.

January
For the January analysis, 8 principal components are retained and rotated. The patterns represent a decomposition of the flow into modes of variability. The total variability extracted is 70.0 percent.

January mode descriptions
Pattern 1 -The West Pacific / North Pacific Oscillation (WP/NPO; Fig. 2) is characterized by a center near the Kamchatka Peninsula at 60°N, 160°E and an oppositely signed east-west Pacific anomaly band at 30°N, centered on the dateline. This oscillation accounts for 10.3% of the total January 500 hPa variance. The congruence coefficient (Richman, 1986) between the RPC loading vector and the point teleconnection pattern that has a basepoint location coinciding with the largest magnitude RPC loading (hereafter referred to as the pattern/teleconnection congruence) is -0.91, suggesting a close match between the RPC and correlation structure. Wallace & Gutzler (1981) identified this pattern in the 500 hPa geopotential heights during boreal winter. They found the thickness pattern consistent with a cold-core equivalent barotropic structure. BL87 found WP/NPO in a 700 hPa analysis of 1950-84 monthly winter height anomalies. Hsu & Wallace (1985) investigated the temporal structure of WP variability at subseasonal time scales, relating the anomalies to Rossby wave dispersion. The NPO has been associated with Alaskan blocking events (Renwick & Wallace, 1996) and modulation of the Pacific storm track (Lau 1988;Rogers 1990). Linkin & Nigam (2008) claim the WP pattern is a basic analog to the North Atlantic Oscillation and has impact on the weather in the Pacific Northwest, especially in coastal regions, in the south-central Great Plains, and on marginal sea ice zones in the Arctic. Our analysis of the pattern time series (Fig. 3a) suggests no significant trend and no year to year persistence in the autocorrelation function (ACF; Fig. 3b) However, the year to year January variability in the WP/NPO (Fig 3a) shows decadal nonstationarity as there exist several periods where the mean is significantly different from zero (Table 1) and the variance structure undergoes dramatic rapid year to year variability in the first three decades of the analysis with lower frequency variability in the past three decades (Table 2).
Pattern 2 -Subtropical zonal winter pattern (SZW; Fig. 4) has a unique morphology with one major anomaly centered over western China (34°N, 90°E). Nearly all regions south of 35°N have negative RPC loadings, suggesting a zonal pattern in the subtropics. There are a number of secondary centers of the same sign extending to the west across northern Africa and in the southwestern US. The pattern accounts for 9.4 percent of the total 500 hPa variance in January. The pattern/teleconnection congruence is -0.92, a close match between the RPC and correlation structure. The time series of PC scores for SZW is nonstationary with a strong inverse linear trend that implies the sign of the anomalies has reversed over the 63 year period (Fig. 5a). There is a sharp discontinuity around 1980. The statistical significance of the linear trend line (Fig. 5a) is p=6x10 -6 , which supports the idea of strong height rises in the subtropical regions. The ACF (Fig. 5b) has a unique pattern with multiyear persistence, with 5 of the first 8 lags significant, owing to the trend in subtropical heights. The decadal analysis of the mean RPC scores strengthens the idea of a reversal in anomaly pattern as the scores are positive in the first three decades and negative in the latter three (Table 1). The decadal variance is below the overall mean of 1 in every decade, suggesting that this pattern persists substantially from year to year (Table 2).
Pattern 3 -The Northern Asian pattern (NA; Fig. 6) has its main center of action close to the North Pole at 80°N and 40°E. Secondary centers of opposite sign are situated over Mongolia at 50°N 100°-120°E and just west of the United Kingdom. Esbensen (1984) identified a similar pattern, although the main center was shifted south of the position shown herein. This mode explains 8.3% of the January height variance. The pattern/teleconnection congruence is 0.95, representing a very close match to the traditional teleconnection pattern. The time series (Fig. 7a) has no discernible trends but has periods of high variance in the 1960's and the 1990's -2000's ( Table 2). The ACF (Fig. 7b) shows no year to year persistence.
Pattern 4 -The eastern Atlantic teleconnection (AE; Fig. 8), accounting for 6.5% of the variance, has a dipole at ~ 40°W with one center at longitude 30°W and a second elongated www.intechopen.com  center of opposite sign with maxima at 30°E, with a strong gradient in PC loadings. The pattern in Europe is elongated and extends southeast over North Africa and the subtropical western Atlantic Ocean with a secondary center at 30°N, 0°. Lau (1988) discusses how this pattern modulates Atlantic storm tracks in Europe through modifying the strength of the westerlies. BL87 identify a version of the AE pattern with more of a north-south dipole. The pattern/teleconnection congruence of AE is 0.81, representing a fairly close match to the traditional teleconnection pattern. The time series (Fig. 9a) has a strong positive trend (p=0.00002). The decadal mean analysis (Table 1) indicates nonstationarity with mostly negative scores during the 1950's-1960's, then near-zero means for the next twenty years and positive values after. The variance of the time series (Table 2) shows periodic fluctuations with low variability in the 1990's. Analysis of the persistence of the time series for the AE (Fig. 9b) indicates possible slight lag 1 autocorrelation although, by chance, one year in twenty is expected to exceed the white noise level. 1948-1957 1958-1967 1968-1977 1978-1987 1988-1997 1998-2007 1948-1957 1958-1967 1968-1977 1978-1987 1988-1997 1998-2007  Pattern 5 -The North Atlantic Oscillation (NAO ; Fig. 10) is a well-documented pattern exhibiting a dipole centered over Iceland and the Azores, explaining 9.3% of the January variance. The center in the subtropics extends well into the United States with a secondary maxima over the southern Plains. A weaker center appears in central Russia near latitude 60°N, 80°E. This pattern is known to control the strength of the westerlies (BL87). The pattern/teleconnection congruence is -0.91, representing a close match to the traditional teleconnection pattern. The time series (Fig. 11a) (Fig. 11b).

RPC Mean
Pattern 6 -The Pacific North American pattern (PNA; Fig. 12), accounting for 8.0% of the variance, is another well-established teleconnection (Esbensen, 1984;Hsu & Wallace, 1985;BL87). The structure is defined by four centers of action. In the RPC loadings, the tropical center is located at 20°N, 165°W and another of opposite sign north at 50°N 175°E. This center has the largest magnitude PC loadings in excess of 0.8. The North American centers are at 50°N, 120°W and an elongated area anchored at 30°N 90°W. This Rossby wavetrain is controlled by the strength of the westerlies (BL87). The pattern/teleconnection congruence is -0.90, representing a close match to the teleconnection pattern. The time series (Fig. 13a) is nonstationary with a significant negative trend (p=0.05), arising largely from the positive values in the first decade of the analysis and near-zero or negative means in the last three decades (Table 1). The variance of the time series (Table 2) indicates aperiodic fluctuations in the variance, most noteworthy during 1948-1957. The time series of the PNA sugest no persistence (Fig. 13b).
Pattern 7 -The Eurasian pattern (EA; Fig. 14) is a high latitude patterns with four centers of action. In the RPC loadings, the primary centers are over the North Sea (58°N, 4°E) and northern Kazakhstan (50°N, 60°E). A secondary center with slightly lower magnitude RPC loadings exists over northeast China (44°N, 125°E) with another lower magnitude center over northern Canada's Northwestern Passages (75°N, 120°W). EA explains 11% of the winter height variance. The pattern is similar to the BL87 Eurasian pattern type 2. The pattern/teleconnection congruence is -0.95, representing a very close match to the traditional teleconnection pattern. The time series (Fig. 15a) has a marginally significant negative trend (p=0.10) arising from the near-zero to positive values through around 1990, followed by more negative values (Table 1). The variance of the time series (Table 2) indicates nonstationarity with alternating bursts of variability on a decadal time scale. The ACF of the time series for the EA (Fig. 15b) indicates very minor persistence in lag 1 and another negative value outside the white noise spectrum in year 12. It is likely that the latter is a statistical artifact. www.intechopen.com Atmospheric Model Applications 288 mode shown herein is nearly identical to the TNH pattern identified in BL87. Our pattern to teleconnection congruence is -0.88, representing a close match to the traditional teleconnection pattern. The time series (Fig. 17a) has a slight negative trend that is not statistically significant. Analysis of decadal means (Table 1) indicates one time period (1968)(1969)(1970)(1971)(1972)(1973)(1974)(1975)(1976)(1977) that had a noteworthy positive value. The variance of the time series (Table 2) is nonstationarity with higher variability prior to 1988 and decreasing variability in the following two decades. The ACF of the time series (Fig. 17b) contains no significant autocorrelations at short lags and two values that barely exceed the white noise bandwidth beyond lag 10, likely statistical artifacts.

Intercorrelation of the time series for the January modes
One advantage of applying an oblique rotational algorithm to the height data is that it provides additional information on the degree of correlation among the modes. This information is shown in Table 3. Although most off-diagonal values are near-zero, SZW's time series was slightly positively correlated with the time series for the PNA. Similarly, the NAO time series was positively correlated with the Eurasian pattern. This was the largest magnitude correlation found (+.32). The time series for the PNA was weakly negatively related to the Eurasian pattern and weakly positively related to that of the TNH.  Table 3. Correlations of the January RPC time series.

July
The July analysis has 4 eigenmodes retained. The RPCs represent a decomposition of the flow into modes of variability extracting 46.8 percent of the monthly height variance.

July mode descriptions
Pattern 1 -The Subtropical Zonal Summer pattern (SZS; Fig. 18) portrays a zonal ring of one sign extending from the south edge of the domain north to approximately 40°N around the Northern Hemisphere. The pattern/teleconnection congruence is 0.98, indicating the PC is nearly identical to the teleconnection mode. Furthermore, the SZS extracts a very large amount of variance explained (24.9%) compared to any other mode (winter or summer) in our analyses. The time series (Fig. 19a) reveals the main signal is a strong trend of increasing heights that is highly statistically significant (p=8.2x10 -12 ). The ACF (Fig. 19b)   the relentless increase in subtropical summer heights with significant lag correlations to twelve years. The time series for the SZS mode (Fig. 19a) and the winter subtropical zonal mode (Fig. 5a) correlate at r=+0.60 (not shown), suggesting seasonal persistence. The SZW and SZS may be manifestations of the same process when the seasonal migration of the ITCZ is considered. Analysis of the decadal means (Table 4) show nonstationarity with thirty years of highly positive values followed by 30 years of highly negative values. As the RPC loadings in the tropics are negative, the negative means correspond to above normal geopotential heights. The variance statistics (Table 5) indicate extremely low variability in every decade, strengthening the argument that the trend, which accounts for 53% of the time series variance, is the major feature of this pattern.
Pattern 2 -The North Pacific mode (NP; Fig. 20) is a complicated wave number 4 pattern, with both north-south and east-west dipoles, that accounts for 7.3% of the variance. The north Pacific dipole has a double center at 59°N, 150°E and 39°N, 158°W. North of that pair of centers is a second set with opposite sign (82°N, 170°E and 51°N, 130°W). There are many other localized positive and negative maxima, most notably over Ontario, Canada, off the mid-Atlantic coast, over the central Atlantic, over the Mediterranean Sea, Greenland, Russia and northern India. The pattern/teleconnection congruence is 0.86, a large value considering the complexity of the pattern. The time series (Fig. 21a) indicates the main signal is a very slight upward trend (not statistically significant). The NP ACF (Fig. 21b) is unremarkable. Analysis of the decadal means (Table 4) shows values near zero except during the 1998-2007 decade where there was a large positive mean. The variance statistics (Table 5)  Pattern 3 -The North American-European mode (NAE; Fig. 22) is a Great Circle pattern with centers over Ontario, Canada (45°N, 80°W), north Greenland (82°N, 45°W), the United Kingdom (52°N, 0°) and the Mediterranean Sea (35°N, 20°E) and describes 6.8% of the variance. The pattern to teleconnection congruence is 0.83, suggesting a fair match between the RPC and the teleconnection pattern. The time series (Fig. 23a) has virtually no trend with low variance in the middle of the data period. The ACF (Fig. 23b) shows no year to year persistence. Decadal means of the NAE (Table 4) were close to zero for all decades. In contrast, the variance was low in the late 1960's to the late 1980's, flanked by periods of much higher variance in a 20 year high, 20 year low, 20 year high sequence.
Pattern 4 -The East Asian-North American pattern (EANA; Fig. 24) is wave number 4 pattern extracting 7.8% of the July variance. There is an elongated center extending from eastern China into the central Pacific Ocean, near 40°N latitude. The pattern becomes more chaotic over North America with two centers at high latitude (~ 75°N) across Canada to south of Iceland. Additionally, there is a center over northern California with one of opposite sign over Illinois, and a third with another sign change in the northwest Atlantic, off the coast of New England. The pattern/teleconnection congruence is -0.91, conveying a very good match with the parent correlations. The time series (Fig. 25a) indicates a significant downward trend over the full period (p=0.03); however, close examination reveals an upward trend from 1948-1966 (piecewise regression trend in that period had p=0.002), little trend in the years 1967-1990 followed by a strong downward trend from 1991-2010 (piecewise regression tend p=.01). The ACF (Fig. 25b) shows no year to year persistence. Analysis of the decadal means (Table 4) indicates nonstationarity with a highly negative mean from 1998-2007. Additionally, during that period, the variance statistics (Table 5) 1948-19571958-19671968-19771978-19871988-19971998-2007 1948-1957 1958-1967 1968-1977 1978-1987 1988-1997 1998-2007 Table 5. Same as Table 2, but for July.

Intercorrelation of the time series for the July modes
After computing oblique PC score correlations (  Table 6. Same as Table 3, but for July.

Conclusion
RPCA was used to filter the monthly 500 hPa geopotential heights in January and July for 1948 -2010 to establish the dominant modes of variability over that period. For January, 8 modes, accounting for 70% of the total variance, were extracted. The majority of the patterns that emerged in these analyses had been identified in other studies, published mainly in the 1980's. However, several new patterns emerged in both months and the variance extracted by each pattern showed major shifts when the recent height data were included. RPCs that highlighted subtropical height anomalies were found to be more important when post-1980's data were included and statistically significant linear trends emerged on 4 of the 8 winter patterns with one additional marginally significant pattern. Moreover, the variability of all patterns had marked decadal variation. The July analyses were even more striking as two of the four modes has significant trends in their RPC scores. The leading mode for summer, SZS, extracted a massive 24.9% of the total variance. That pattern was zonally www.intechopen.com Identification of Intraseasonal Modes of Variability Using Rotated Principal Components 295 symmetric with the largest negative RPC loadings in the subtropics. The time series had an exceedingly large downward trend, with a p=8.2X10 -12 , indicating large height rises in the subtropics. As this mode had a pattern/teleconnection congruence of -0.98, it is thought to be very robust and not an artifact of the multivariate analysis. Three other patterns were found for the July analysis and, together, the four account for 46.8% of the 500 hPa variability. Taken collectively, these findings suggest that the dominant 500 hPa modes are robust but nonstationary.
It is hoped that climate modelers will decompose their model outputs for 1948-2010 atmospheres into RPC modes and compare those model modes to the ones documented herein. Differences in the model modes, time series and persistence statistics from the observed modes document the details of how model climatologies differ. Inspection of the specific disparities may help to identify correctable problems in such models.

Acknowledgment
A portion of Michael Richman's time was supported by the Cooperative Institute for Mesoscale Meteorological Studies. Andrew Mercer's time was supported in part by the Northern Gulf Institute.