1. First, a question about the sense of EOF's:
You have some data(x,t) with space-time structure: 144 space bins (in this case, just longitude) by 240 time bins (months).
You want to decompose it into a set of orthogonal terms that add together to give the total.
Since they are orthogonal, each term represents some variance: cross terms disappear when you average the square of the sum.
If you keep enough terms you will get back all the variance (and more importantly, you can reconstruct the data in all its detail).
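The statement about cross terms can be checked numerically. The following is a minimal sketch in Python using two made-up orthogonal time series (a sine and a cosine sampled over whole periods, not the homework data); the names `a`, `b`, and `mean_product` are purely illustrative.

```python
import math

n = 240  # number of time samples, matching the 240 months above

# Two hypothetical zero-mean, mutually orthogonal terms (illustration only).
a = [2.0 * math.sin(2 * math.pi * k / n) for k in range(n)]
b = [0.5 * math.cos(2 * math.pi * 3 * k / n) for k in range(n)]

def mean_product(u, v):
    """Average of the elementwise product of two equal-length series."""
    return sum(ui * vi for ui, vi in zip(u, v)) / len(u)

total = [ai + bi for ai, bi in zip(a, b)]

var_a = mean_product(a, a)
var_b = mean_product(b, b)
cross = mean_product(a, b)              # vanishes because a and b are orthogonal
var_total = mean_product(total, total)  # mean square of the sum

print(f"var(a)={var_a:.4f}  var(b)={var_b:.4f}  cross={cross:.2e}")
print(f"var(a+b)={var_total:.4f}  var(a)+var(b)={var_a + var_b:.4f}")
```

Because the cross term averages to zero, the variance of the sum equals the sum of the variances, which is what lets each orthogonal term be assigned its own share of the total.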
In the case of EOF (also known as Principal Component (PC)) analysis, you express your data as:
data(x,t) = EOF1(x)*PC1(t) + EOF2(x)*PC2(t) + ...
How many values (numbers) are in your input data array? 144 (lon) * 240 (mon) = 34,560
How many values (numbers) are needed to build each term on the right? 144 for EOF1(x) plus 240 for PC1(t), i.e. 384 per term
If 5 EOFs capture most of the data's variance, how much smaller (in the above sense) is the EOFxPC representation compared to the full data set? 144*240 - (144+240)*5 = 34,560 - 1,920 = 32,640 fewer values, i.e. the truncated representation is 18 times smaller
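The counting argument above can be written out as a few lines of Python. This is only the arithmetic from the answers, with illustrative variable names.

```python
n_lon = 144    # space bins (longitude)
n_mon = 240    # time bins (months)
n_modes = 5    # number of EOF/PC pairs retained

full_size = n_lon * n_mon             # values in the raw data array
per_mode = n_lon + n_mon              # one EOF(x) plus one PC(t)
truncated_size = per_mode * n_modes   # values in the 5-mode representation

saved = full_size - truncated_size    # how many fewer values are stored
ratio = full_size / truncated_size    # compression factor

print(full_size, truncated_size, saved, ratio)  # 34560 1920 32640 18.0
```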
2. Read in your field1 (let's call it x again). Use the same data source as in HW3.
Perform and display an EOF analysis of your first field.
IDL: example code is in [[file/view/HW5_EOF_BEM.pro|HW5_EOF_BEM.pro]]
IDL results are at the bottom of the page
Matlab: x should be ordered (t,x), i.e. (240,144), with time along the rows. In the file it is (144 x 240), so you need to transpose it.
Then call [COEFF, SCORE, latent, tsquare] = princomp(x);
COEFF is a 144 x 144 array of coefficients ("projections", "loadings", or "weights" in the x domain); each column is one EOF.
SCORE is a 240 x 144 array holding the PC time series (240 values each) associated with each of the 144 EOFs.
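For readers without Matlab or IDL, the same quantities can be sketched with NumPy by taking the SVD of the column-centered array. This is an illustration under stated assumptions, not the course code: the data are synthetic random numbers standing in for the homework field, and the names `coeff`, `score`, and `latent` mirror the princomp outputs only by analogy. Individual modes may differ from Matlab's by an overall sign.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the homework field: 240 months by 144 longitudes.
n_time, n_lon = 240, 144
x = rng.standard_normal((n_time, n_lon))

# Remove the time mean at each longitude (the column means).
x_mean = x.mean(axis=0)
anom = x - x_mean

# Economy-size SVD: anom = u @ diag(s) @ vt
u, s, vt = np.linalg.svd(anom, full_matrices=False)

coeff = vt.T                        # (144, 144): EOF patterns in longitude, one per column
score = u * s                       # (240, 144): PC time series, one per column
latent = s**2 / (n_time - 1)        # variance carried by each mode
explained = latent / latent.sum()   # fraction of total variance per mode

# Keeping every mode rebuilds the data; keeping 5 gives a truncated version.
recon_full = score @ coeff.T + x_mean
recon_5 = score[:, :5] @ coeff[:, :5].T + x_mean

print(coeff.shape, score.shape)
print("full reconstruction error:", np.abs(recon_full - x).max())
```

With real data, plotting the first few columns of `coeff` against longitude and the matching columns of `score` against time gives the usual EOF and PC displays.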
Extra credit/ teach us something new:
Do an EOF analysis of your second field. Display and interpret. Compare and contrast with your first field.
The first several EOFs of SST explain much more of the variance than the first several EOFs of precipitation. This suggests that the precipitation data are noisier. In both fields the first mode represents ENSO, but the strongest signal sits farther east for SST.
Remove the mean, or don't, or remove a different mean. How are the results affected? Explain the sense of the results.
Removing the mean doesn’t seem to affect the results.
Standardize the time series at each longitude: this gives eigenvectors and eigenvalues of the correlation matrix rather than the covariance matrix (recall HW3 where you plotted slices of these).
Standardizing slightly reduces the variance explained by the first several EOF modes. In the first mode, the maximum signal shifts slightly eastward.
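The covariance versus correlation comparison can be sketched in NumPy as follows. The field here is synthetic (random noise whose amplitude grows with longitude index), so the printed fractions say nothing about SST or precipitation; the helper `eof_variances` is a hypothetical name introduced for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_time, n_lon = 240, 144

# Synthetic field whose variance grows with longitude index (illustration only).
x = rng.standard_normal((n_time, n_lon)) * np.linspace(0.5, 3.0, n_lon)

def eof_variances(field):
    """Eigenvalues of the covariance matrix of `field` (rows = time), via SVD."""
    anom = field - field.mean(axis=0)
    s = np.linalg.svd(anom, compute_uv=False)
    return s**2 / (field.shape[0] - 1)

# Covariance-based: anomalies keep their physical amplitude.
lam_cov = eof_variances(x)

# Correlation-based: divide each longitude's time series by its own standard deviation.
z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
lam_corr = eof_variances(z)

print("covariance  EOF1 fraction:", lam_cov[0] / lam_cov.sum())
print("correlation EOF1 fraction:", lam_corr[0] / lam_corr.sum())
print("sum of correlation eigenvalues:", lam_corr.sum())  # equals the number of longitudes
```

After standardizing, every longitude contributes unit variance, so high-variance regions no longer dominate the leading modes. That is one plausible reason the leading pattern can shift when switching from covariance to correlation.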
Try doing the computation with x and t transposed. Now the "coefficients" or "eigenvectors" are in time (240) and the "scores" are in space (144).
There is a part of the total spacetime variance that EOF's can't reach if you remove the TIME mean, but then use SPACE as the statistical dimension over which you sum to compute covariances. (Or, for that matter, if you remove the SPACE mean to define anomalies but then perform a TIME covariance analysis). What is that unreachable part of the spacetime variance? (Just look at the difference between the input data and the reconstruction and you will see what I am getting at.)
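The comparison suggested in the question (input minus reconstruction) can be sketched in NumPy. The field is synthetic, and the sketch assumes that the covariance step centers the data along the statistical dimension, as covariance estimates normally do. Under that assumption, the difference between the input anomalies and the all-mode reconstruction is the space mean at each time, a function of t only.

```python
import numpy as np

rng = np.random.default_rng(2)
n_lon, n_time = 144, 240

# Synthetic field (illustration only): noise plus a signal shared by all longitudes.
shared = np.sin(2 * np.pi * np.arange(n_time) / 12.0)
data = rng.standard_normal((n_lon, n_time)) + shared   # shape (space, time)

# Step 1: define anomalies by removing the TIME mean at each longitude.
anom = data - data.mean(axis=1, keepdims=True)

# Step 2: EOF analysis with SPACE as the statistical (sampling) dimension.
# Computing covariances over space implicitly removes the SPACE mean at each time.
space_mean = anom.mean(axis=0, keepdims=True)          # shape (1, time)
centered = anom - space_mean
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Reconstruction from ALL modes.
recon = (u * s) @ vt

# What the modes cannot reach: the difference from the input anomalies.
residual = anom - recon

print("max |residual - space_mean|:", np.abs(residual - space_mean).max())
print("unreached fraction of variance:", (residual**2).sum() / (anom**2).sum())
```

In this sketch the residual is identical at every longitude, so the part of the variance left out is whatever is carried by the domain-mean time series.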
EOF with x and t transposed:
EOF with spatial mean removed.