Assignment

=Empirical Orthogonal Functions (EOF) homework.=

1. First, a question about the sense of EOF's:
You have some data(x,t) with space-time structure: 144 space bins (in this case, just longitude), by 240 time bins (months). You want to decompose it into a set of orthogonal terms that add together to give the total. Since they are orthogonal, each term represents some variance: cross terms disappear when you average the square of the sum. If you keep enough terms you will get back all the variance (and more importantly, you can reconstruct the data in all its detail).

In the case of EOF (also known as Principal Components (PC)) analysis, you express your data as: math {EOF}_1(\bold x) {PC}_1(t) + {EOF}_2(\bold x) {PC}_2(t) + {EOF}_3(\bold x) {PC}_3(t) + ... = mydata(x,t) math
 * 1) How many values (numbers) are in your input data array?
 * 2) How many values (numbers) are needed to build each term on the left?
 * 3) If 5 EOFs capture most of the data's variance, how much smaller (in the above sense) is the EOFxPC representation compared to the full data set?

2. Read in your field1 (let's call it x again). Use the same data from HW3 data source here.
>
 * Perform and display an EOF analysis of your first field.**
 * IDL: here is an example code [|HW5_EOF_BEM.pro]
 * IDL results are at the bottom of the page
 * Matlab: x should be (x,t) or (240,144), but in the file it is (144 x 240), so you need to transpose it.
 * Then call [COEFF, SCORE, latent, tsquare] = princomp(x);
 * COEFF is a 144 x 144 array of coefficients ("projection" or "loadings" or "weights" in the x domain)
 * SCORE is a 240 x 144 array of the PC(240) associated with each EOF(144).

code format="matlab" %%% Matlab: here are the key lines. Reconstruction is the test of whether you have things ordered right. %%% If you can reconstruct the data, you can zero out some of the modes (in a copy of either SCORE or COEFF) %%% and reconstruct just one mode, or the sum of the first N modes, or whatever.

x = double(ncload('/Users/bem/Downloads/hw3_data.nc','olr')); NLONS = 144;

% EOF analysis only deals with variations (because it works on the covariance matrix). % Temporal covariance that is. So remove the time mean of each longitude.

xp = x.*0; % make array of correct size for x perturbations (pert. from time mean) for ix = 1:NLONS xp(ix,:) = x(ix,:) - mean(x(ix,:)); end

xp = transpose(xp); [COEFF,SCORE,latent,tsquare] = princomp( xp ); reconXp = SCORE*transpose(COEFF); diff = xp-reconXp;

figure(1) contourf(xp) figure(2) contourf(reconXp) figure(3) contourf(diff)

explained = latent/sum(latent) .*100; explained(1:10)

% truncate to 1st EOF only to show pure 1st mode variability SCOREtrunc = SCORE; SCOREtrunc(:,2:end) = 0; % zero out modes 2:144 reconXtrunc = SCOREtrunc*transpose(COEFF); figure(4) contourf(reconXtrunc) %%% result below as test: looks good

code

Extra credit/ teach us something new:
The double orthogonality condition: math \int EOF_i(\vec x) EOF_j(\vec x) d \vec x = 0
 * 1) Do an EOF analysis of your second field. Display and interpret. Compare and contrast with your first field.
 * 2) Remove the mean, or don't, or remove a different mean. How are the results affected? Explain the sense of the results.
 * 3) Standardize the time series at each longitude: this gives eigenvectors and eigenvalues of the //correlation// matrix rather than the //covariance// matrix (recall HW3 where you plotted slices of these).
 * 4) Try doing the computation with x and t transposed. Now the "coefficients" or "eigenvectors" are in time (240) and the "scores" are in space (144).
 * 5) There is a part of the total spacetime variance that EOF's can't reach if you remove the TIME mean, but then use SPACE as the statistical dimension over which you sum to compute covariances. (Or, for that matter, if you remove the SPACE mean to define anomalies but then perform a TIME covariance analysis). What is that unreachable part of the spacetime variance? (Just look at the difference between the input data and the reconstruction and you will see what I am getting at.)
 * 6) Do a "Combined EOF" analysis of a vector that combines the two fields (each field must be standardized, since the units are different).
 * 7) you just make a (240x288) array where the 288 values at each time are the 144 field1 (standardized) and then the 144 field2.
 * 8) Run princomp in the usual way
 * 9) Unpack the results at plotting time: the first 144 values are your field1, the others your field2. Rescale with physical units for a better plot.
 * 10) **CEOFs here maximized the variance of the //combined// data, so they indicate //related variations// between the 2 fields.**
 * examples: [|CEOF.slp.uwnd.ps] [| CEOF.sst.slp.ps] [|CEOF.sst.precip.ps] from code [|HW5_CEOF_BEM.pro]
 * 1) Read rotatefactors documentation (Matlab) and learn and teach us all about "rotated EOFs".
 * 2) Background: you first truncate to the first few EOF/PC pairs, then relax one or more of the orthogonality conditions.
 * 3) (background of background: EOFs are orthogonal in space, AND PCs are orthogonal in time. Heavy constraint! Either one would allow us to still speak of variance as being cleanly partitioned among modes. Rotated modes may or might or could (?) thus be more physical, since a purely mathematical constraint has been relaxed).

and

\int PC_i(t) PC_j(t) dt = 0 math

IDL results for OLR field
[|HW5_EOF_BEM.ps] [|IDL code HW5_EOF_BEM.pro]

Showing that an EOF is an eigenvector of the covariance matrix
(so that the results here are related back to those of Question 6.1 of HW3).

Recall that an //eigenvector// of a matrix is a vector that, when multiplied by the matrix, returns the same vector (times a constant, the eigenvalue): math \matrix M \vec x = \lambda \vec x math