Empirical Orthogonal Functions (EOF) homework.

Background reading: Hsieh book, Chapter 2 (out of order with Fourier analysis, his Ch. 3).



1. First, a question about the sense of EOFs:

You have some data(x,t) with space-time structure: 144 space bins (in this case, just longitude), by 240 time bins (months).
You want to decompose it into a set of orthogonal terms that add together to give the total.
Since they are orthogonal, each term represents some variance: cross terms disappear when you average the square of the sum.
If you keep enough terms you will get back all the variance (and more importantly, you can reconstruct the data in all its detail).

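In the case of EOF (also known as Principal Component (PC)) analysis, you express your data as a sum of terms, each one a spatial pattern times a time series:

  EOF1(x)*PC1(t) + EOF2(x)*PC2(t) + ... = data(x,t)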

  1. How many values (numbers) are in your input data array? The data has 144 longitude points and 240 months, so 144*240 = 34,560 values.
  2. How many values (numbers) are needed to build each term on the left? You need 144 values for the EOF (spatial mode) and 240 values for the PC (expansion coefficients in time), so 144 + 240 = 384 values per term.
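  3. If 5 EOFs capture most of the data's variance, how much smaller (in the above sense) is the EOFxPC representation compared to the full data set? The full data set holds 144*240 = 34,560 values, while 5 terms need 5*(144+240) = 1,920 values, so the representation is 34,560/1,920 = 18 times smaller. (The ratio 144/5 = 28.8 compares 5 terms with the complete set of 144 EOF/PC pairs, not with the raw data.)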
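
The counting in the three answers above can be checked with a few lines of arithmetic. This is only a sketch: the dimensions come from the text, and the variable names are made up for illustration.

```python
# Storage count for a truncated EOF x PC representation.
# Dimensions are from the homework text; variable names are illustrative only.
n_space = 144   # longitude bins
n_time = 240    # monthly time bins
n_modes = 5     # number of EOF/PC pairs kept

values_in_data = n_space * n_time                   # full data array
values_per_term = n_space + n_time                  # one EOF plus one PC
values_in_truncation = n_modes * values_per_term    # 5 EOF/PC pairs
compression = values_in_data / values_in_truncation

print(values_in_data, values_per_term, values_in_truncation, compression)
# prints: 34560 384 1920 18.0
```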

2. Read in your field1 (let's call it x again). Use the same data source as in HW3.


Perform and display an EOF analysis of your first field.
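
One possible way to set up the computation is sketched here in NumPy, assuming the field is stored as a (240 x 144) array with time along the rows. The data is synthetic (two made-up patterns plus noise), standing in for field1, and `eof_analysis` is a hypothetical helper name. MATLAB's princomp() returns closely related quantities: coefficients (the EOFs), scores (the PCs) and the variances of each mode.

```python
import numpy as np

def eof_analysis(x):
    """EOFs of x with shape (n_time, n_space); time is the statistical dimension."""
    anomalies = x - x.mean(axis=0)            # remove the time mean at each longitude
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt                                 # rows are unit-length spatial patterns
    pcs = u * s                               # columns are time series carrying the amplitude
    variance_fraction = s**2 / np.sum(s**2)   # share of total variance in each mode
    return eofs, pcs, variance_fraction

# Synthetic stand-in for field1 (NOT the homework data): two patterns plus noise.
rng = np.random.default_rng(0)
lon = np.linspace(0.0, 2.0 * np.pi, 144, endpoint=False)
months = np.arange(240)
x = (3.0 * np.outer(np.sin(2 * np.pi * months / 12), np.cos(lon))
     + 1.0 * np.outer(np.cos(2 * np.pi * months / 60), np.sin(2 * lon))
     + 0.1 * rng.standard_normal((240, 144)))

eofs, pcs, variance_fraction = eof_analysis(x)

# Reconstruction from the first 5 modes, with the time mean added back.
x_rec5 = x.mean(axis=0) + pcs[:, :5] @ eofs[:5, :]
print("variance fraction of first 3 modes:", variance_fraction[:3])
print("max truncation error:", np.abs(x - x_rec5).max())
```

Keeping all 144 modes reproduces the input exactly (to rounding), which is a quick sanity check before plotting the leading EOFs and PCs.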

q2_rev.gif

q2_lopez.gif

Extra credit/ teach us something new:

  1. Do an EOF analysis of your second field. Display and interpret. Compare and contrast with your first field.
  2. Remove the mean, or don't, or remove a different mean. How are the results affected? Explain the sense of the results.
  3. Standardize the time series at each longitude: this gives eigenvectors and eigenvalues of the correlation matrix rather than the covariance matrix (recall HW3 where you plotted slices of these).
  4. Try doing the computation with x and t transposed. Now the "coefficients" or "eigenvectors" are in time (240) and the "scores" are in space (144).
    1. There is a part of the variance that EOFs can't reach if you remove the TIME mean but then use SPACE as the statistical dimension. What is that part of the variance? Look at the difference between the input data and the reconstruction and you will see what I mean.
  5. Do a "Combined EOF" analysis of a vector that combines the two fields (each field must be standardized, since the units are different).
    1. Make a (240x288) array where the 288 values at each time are the 144 field1 values (standardized) followed by the 144 field2 values (standardized).
    2. Run princomp() in the usual way.
    3. Unpack the results at plotting time: the first 144 values are your field1, the remaining 144 your field2. Rescale to physical units for a better plot.
    4. CEOFs maximize the variance of the combined data, so they indicate related variations between the 2 fields.
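
The combined EOF recipe in item 5 might look roughly like this in NumPy. Everything here is an assumption made for illustration: the two fields are synthetic and share one random signal, `standardize` is a hypothetical helper, and np.linalg.svd stands in for princomp().

```python
import numpy as np

def standardize(field):
    """Remove the time mean and divide by the standard deviation at each longitude."""
    anomalies = field - field.mean(axis=0)
    return anomalies / anomalies.std(axis=0, ddof=1)

# Synthetic stand-ins for the two fields (NOT the homework data).
rng = np.random.default_rng(1)
n_time, n_space = 240, 144
lon = np.linspace(0.0, 2.0 * np.pi, n_space, endpoint=False)
shared = rng.standard_normal(n_time)          # signal common to both fields
field1 = 5.0 * np.outer(shared, np.cos(lon)) + rng.standard_normal((n_time, n_space))
field2 = -20.0 * np.outer(shared, np.sin(lon)) + 4.0 * rng.standard_normal((n_time, n_space))

# Step 1: build the (240 x 288) combined array of standardized fields.
combined = np.hstack([standardize(field1), standardize(field2)])

# Step 2: EOF analysis of the combined array (SVD used here in place of princomp).
u, s, vt = np.linalg.svd(combined, full_matrices=False)
pc1 = u[:, 0] * s[0]

# Step 3: unpack the leading combined EOF and rescale to physical units.
ceof1_field1 = vt[0, :n_space]
ceof1_field2 = vt[0, n_space:]
ceof1_field1_phys = ceof1_field1 * field1.std(axis=0, ddof=1)
ceof1_field2_phys = ceof1_field2 * field2.std(axis=0, ddof=1)

print("correlation of PC1 with the shared signal:", np.corrcoef(pc1, shared)[0, 1])
```

The sign of an EOF/PC pair is arbitrary, so the correlation printed at the end may come out close to +1 or close to -1.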

HOW TO DO SVD OF COMBINED FIELDS

Here, my second field is OLR. I did not follow the extra credit question, but I did something similar:
If you want to do EOF analysis on two fields, Singular Value Decomposition (SVD) is one way to do it. In fact, EOF analysis of a single field is a special case of this.
First, you have two variables, let's say U and OLR. You need to construct their cross-correlation matrix (you can also use the cross-covariance, but since the units are different, you may want to normalize both data matrices). This cross-correlation matrix does not have to be square. Then you do the SVD on it.
Example:
Uwnd = [n by m] matrix of field one (normalized)
OLR = [n by p] matrix of field two (normalized)
Note that they share a common dimension, which in this case is time, so n = 240; for this homework m = p = 144.

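Cross-correlation matrix: Corr = transpose(Uwnd) * OLR / (n-1) = [m by p] matrix (the factor 1/(n-1) rescales the singular values but does not change the singular vectors)
Now do the SVD in MATLAB: [U, S, V] = svd(Corr)
Here U = [m by m] matrix of singular vectors for Uwnd (the left field)
V = [p by p] matrix of singular vectors for OLR (the right field)
S = [m by p] diagonal matrix of singular values. Each diagonal entry is the covariance between the two PC time series of that mode; the fraction of squared covariance explained by mode k is S(k,k)^2 / sum(diag(S).^2).

You can get the PCs from: Uwnd_PC = Uwnd * U and OLR_PC = OLR * V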
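
The recipe above can be sketched in NumPy as follows. The fields are synthetic stand-ins (not the actual U wind and OLR data), `standardize` is a hypothetical helper, and the 1/(n-1) normalization is an assumption about how the cross-correlation matrix is scaled. Note that np.linalg.svd returns the transpose of V, unlike MATLAB's svd.

```python
import numpy as np

def standardize(field):
    """Remove the time mean and divide by the standard deviation in each column."""
    anomalies = field - field.mean(axis=0)
    return anomalies / anomalies.std(axis=0, ddof=1)

# Synthetic stand-ins for the two fields, sharing one common time signal.
rng = np.random.default_rng(2)
n, m, p = 240, 144, 144
lon = np.linspace(0.0, 2.0 * np.pi, m, endpoint=False)
shared = rng.standard_normal(n)
uwnd = standardize(np.outer(shared, np.cos(lon)) + 0.3 * rng.standard_normal((n, m)))
olr = standardize(np.outer(shared, np.sin(lon)) + 0.3 * rng.standard_normal((n, p)))

# Cross-correlation matrix between every Uwnd column and every OLR column, shape (m, p).
corr = uwnd.T @ olr / (n - 1)

# SVD: columns of U are patterns for the left field, columns of V for the right field.
U, s, Vt = np.linalg.svd(corr)
V = Vt.T

# Expansion coefficients (PCs): project each field onto its own patterns.
uwnd_pc = uwnd @ U
olr_pc = olr @ V

# Fraction of squared covariance explained by each mode.
scf = s**2 / np.sum(s**2)

# The covariance between paired PCs reproduces the singular value of that mode.
cov_mode1 = np.sum(uwnd_pc[:, 0] * olr_pc[:, 0]) / (n - 1)
print("first singular value:", s[0], "covariance of first PC pair:", cov_mode1)
print("squared covariance fraction, first 3 modes:", scf[:3])
```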

svd_olr.gif

svd_trunc1-3.gif
svd_trunc4-full.gif