Homework #4: Maximum Covariance Analysis (MCA), Empirical Orthogonal Functions (EOFs), Principal Component Analysis (PCA), Factor Analysis, Singular Value Decomposition (SVD) of covariance matrices.
A lot of names for basically one or two things.
My two variables: precipitation, outgoing longwave radiation
Import data, and subtract off a composite seasonal cycle (which also removes the long term mean). Display the raw data sections.
Look at the precipitation covariance matrix with itself:
Let's diagonalize C_preipPrecip, i.e. find its eigenvectors V and eigenvalues d (diagonals in matrix D) in Matlab's notation V,D. Display the results.
Which is EOF1? The one with the largest eigenvalue (they are ordered from smallest to largest in Matlab so it is 144). What is the nature of EOF1? Best shown visually through its 144x144 OUTER PRODUCT: (EOF1 * EOF1').
Let's unpack that a little more for clarity:
What is the next EOF? The next checkerboard approx. to resid.
% What if we project the data onto EOF1? We get PC1.
What if we multiply using the outer product again? We get a 1-mode reconstruction of the original data.
Two mode reconstruction is better.
Notice they are orthogonal (their product sums to zero) in BOTH Space and Time (and so obviously in spacetime).
Can we do something similar to a covariance matrix that isn't symmetric? Eigenvectors are complex for a nonsymmetric matrix, that isn't quite it. Instead we can find the sets of column vectors U and V ("singular vectors") that make the best checkerboard fit to cross-covariance C_precipOlr.
The SVD operation.
Build the leading column vectors' checkerboard, lon x lon in dimension.
Let's unpack that a little more for clarity:
Reconstruction of individual fields from left SV's U (for precipitation) and right SV's V (for outgoing longwave radiation) is just like from EOFs.
Are there areas where EOF1 has the sign of covariance wrong? Yes! The sign of small-magnitude regions in E1E1 can't be trusted...
Are there areas where SVD1 has the sign of covariance wrong? Yes! The sign of small-magnitude regions in U1V1 can't be trusted... Compute cov(T') instead of cov(T). This is the matrix of SPATIAL covariances at each TIME (T*T' /144., a 240x240 matrix). Display it, does it make much sense? No, it does not make much sense. And yet it is useful... Show that its first 144 eigenvalues are the same as the 144 eigenvalues of the normal matrix of TEMPORAL covariances. The importance is this: when climatologists have many spatial points (many thousands of grid cells all over the world), but far fewer time samples (12 months x 50 years = 600 times, say), the thousands x thousands covariance matrix is huge, and is underdetermined or rank-deficient. So you might as well do the problem the other way (600 x 600). Using the time x time sized matrix of SPATIAL covariances, for each of its 600 eigenvalues you get an eigenvector (which is now a 600-point time structure rather than a spatial EOF structure as in the normal case). You project that time structure onto the time series at each spatial grid cell, and voila! You recover the EOFs (spatial patterns). Show us please!
My codes:
% Roque Vinicio Cespedes% MPO 524 - HW #4% April 11, 2013closeall force;
clear;
clc;
% -----% Import data, and subtract off a composite seasonal cycle (which also removes the long term mean)load('Eq_timelon_sections.mat')
T = precip - repmat(squeeze(mean(reshape(precip,12,20,144),2)) ,20,1); %% precipitation anomalies from mean seasonal cycle
u = olr - repmat(squeeze(mean(reshape(olr,12,20,144),2)) ,20,1); %% outgoing longwave radiation " " "%Display the raw data sectionsfigure('Color',[111],'Position',[040500500]); colormap(redblue(255));
subplot(121); imagesc(lon,year,T); colorbar; caxis([-11] *max(abs(T(:)))); title('precipitation anomalies')subplot(122); imagesc(lon,year,u); colorbar; caxis([-11] *max(abs(u(:)))); title('outgoing longwave radiation anomalies')% Look at the precipitation covariance matrix with itself: cov(T) = T'*T/239.
C_TT = T'*T /240.; % 239 is what cov() uses, that old N-1 issue
bigvalue = max(abs(C_TT(:)));
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc(lon,lon,C_TT); colorbar; caxis([-11] *bigvalue); title('Covariance matrix precipitation with precipitation')% Let's diagonalize C_TT, i.e. find its eigenvectors V and eigenvalues d (diagonals in matrix D) in Matlab's notation V,D:[V,D] = eig(C_TT);
d = diag(D);
% display the resultsfigure('Color',[111],'Position',[040500500]); colormap(redblue(255));
subplot(211); plot(diag(D)); title('eigenvalues, a measure of magnitude')subplot(212); imagesc(V); caxis([-11] *0.3); title('columns are eigenvectors, aka EOFs')% Which is EOF1? The one with the largest eigenvalue (they are ordered from smallest to largest in Matlab so it is 144)
EOF1 = V(:,144); d1 = d(144);
% What is the nature of EOF1? Best shown visually through its 144x144 OUTER PRODUCT: (EOF1 * EOF1')
E1E1 = EOF1*EOF1' *d1;
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc(E1E1); caxis([-11] *bigvalue); title('E1E1: the Best checkerboard approximation to Cov(precip,precip)'); colorbar% Let's unpack that a little more for clarity:figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
E1E1 = EOF1*EOF1' *d1;
resid = C_TT - E1E1;
imagesc([C_TT, E1E1, resid]); colorbar; caxis([-11] *bigvalue)title('C, outer product EOF1#EOF1 *d1, and the difference C - E1E1*d1')line([144,144]*1 , [0,144])%%% lines to separate partsline([144,144]*2 , [0,144])% -----% What is the next EOF? The next checkerboard approx. to resid
EOF2 = V(:,143);
E2E2 = EOF2 * EOF2' *d(143);
residmax = max(abs(resid(:)));
resid2 = resid-E2E2;
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc([resid, E2E2, resid-E2E2]); colorbar; caxis([-11] *residmax)title('resid, outer product EOF2#EOF2*d2, and the difference')line([144,144]*1 , [0,144])%%% lines to separate partsline([144,144]*2 , [0,144])% What if we project the data onto EOF1? We get PC1.
PC1 = T*EOF1; PC2 = T*EOF2;
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
subplot(211)plot(lon, EOF1); title('Structure of EOF1(x) in longitude')subplot(212)plot(year, PC1); title('PC1(t): The Projection of precipitation anomalies onto EOF1 at each time')% What if we multiply using the outer product again? We get a 1-mode reconstruction of the original data
recon1 = PC1*EOF1'; % a truncated version of the data, reconstructed with just one modefigure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc(lon,year,[T, recon1]); colorbar; caxis([-11] *4);
title('precipitation anomalies and PC1(t)*EOF1(x) reconstruction')line([144,144] , [0,240])% -----% Two mode reconstruction is better
recon12 = recon1 + PC2*EOF2';
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc([lon,360+lon] ,year, [T, recon12]); colorbar; caxis([-11] *4);
title('precipitation anomalies and PC1(t)*EOF1(x) +PC2(t)*EOF2(x) reconstruction')line([360, 360] , [1980,2010])% Notice they are orthogonal (their product sums to zero) in BOTH Space and Time (and so obviously in spacetime).figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc((PC1*EOF1') .* (PC2*EOF2')); colorbar; caxis([-11]);
title('Elementwise product of PC1(t)EOF1(x) and PC2(t)EOF2(x)')% -----% Can we do something similar to a covariance matrix that isn't symmetric?% Eigenvectors are complex for a nonsymmetric matrix, that isn't quite it.% Instead we can find the sets of column vectors U and V ("singular vectors")% that make the best checkerboard fit to cross-covariance C_Tu
C_Tu = T'*u /240.;
maxvalue = max(abs(C_Tu(:)));
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc(lon,lon,C_Tu); colorbar; caxis([-11] *maxvalue); title('Covariance matrix precipitation with outgoing longwave radiation')xlabel('olr longitude')ylabel('precipitation longitude')% The SVD operation[U,S,V] = svd(C_Tu);
s = diag(S); % the singular valuesfigure('Color',[111],'Position',[040500500]); colormap(redblue(255));
subplot(311)
imagesc(U); colorbar; caxis([-11] *max(abs(U(:))));
title('Left singular vectors U in [U,S,V]=svd(C_precipOlr): precipitation structures')subplot(312)plot(s); title('Singular values (magnitudes): returned in decreasing order this time')subplot(313)
imagesc(V); colorbar; caxis([-11] *max(abs(V(:))));
title('Right singular vectors V in [U,S,V]=svd(C_precipOlr): outgoing longwave radiation structures')% Build the leading column vectors' checkerboard, lon x lon in dimension
U1 = U(:,1);
V1 = V(:,1);
U1V1 = U1 * V1' *s(1);
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc(lon,lon,U1V1); colorbar; caxis([-11] *maxvalue);
title('U1V1 singular vector product, precipitation with olr')% -----% Let's unpack that a little more for clarity:
residTu = C_Tu - U1V1;
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
resid = C_Tu - U1V1;
imagesc([C_Tu, U1V1, residTu]); colorbar; caxis([-11] *bigvalue)title('Cov(precip,olr), SVD1 outer product U1#V1 *s1, and the residual')line([144,144]*1 , [0,144])%%% lines to separate partsline([144,144]*2 , [0,144])% -----% Reconstruction of individual fields from left SV's U (for precipitation) and right SV's V (for outgoing longwave radiation)% is just like from EOFs.
tsU = T*U1; tsV = u*V1;
reconT = tsU*U1'; % a truncated version of the precipitation data, reconstructed with just one SVD mode
reconu = tsV*V1'; % a truncated version of the olr data, reconstructed with just one SVD modefigure('Color',[111],'Position',[040500500]); colormap(redblue(255));
subplot(211)
imagesc([lon,360+lon] ,year, [T, reconT]); colorbar; caxis([-11] *4);
title('precipitation anomalies and ts(t)*SVD1(x) reconstruction')line([360, 360] , [1980,2010])subplot(212)
imagesc([lon,360+lon] ,year, [u, reconu]); colorbar; caxis([-11] *4);
title('olr anomalies and ts(t)*SVD1(x) reconstruction')line([360, 360] , [1980,2010])
% ----% Are there areas where EOF1 has the sign of covariance wrong?figure('Color',[111],'Position',[040500500]); imagesc(E1E1 .* C_TT); caxis([-1,1] *125); colorbar; colormap(redblue(255));
title('E1E1 .* CprcipPrecip');
% Are there areas where SVD1 has the sign of covariance wrong?figure('Color',[111],'Position',[040500500]); imagesc(U1V1 .* C_Tu); caxis([-1,1] *3500); colorbar; colormap(redblue(255));
title('U1V1 .* CprecipOlr');
% -----% Calculate cov(precip')
C_T_trap = T*T'/240.; % 239 is what cov() uses, that old N-1 issue
bigvalue_trap = max(abs(C_T_trap(:)));
figure('Color',[111],'Position',[040500500]); colormap(redblue(255));
imagesc(lon,lon,C_T_trap); colorbar; caxis([-11] *bigvalue_trap); title('cov(precip trap)');
eig_C_TT = eig(C_TT); eig_C_T_trap=eig(C_T_trap);
figure('Color',[111],'Position',[040500500]); hold('all'); plot(eig_C_TT,'r-','LineWidth',3);
plot(eig_C_T_trap(97:240),'k--','LineWidth',3); l1=legend('eig(cov(precip))','eig(cov(precip trap(97:240)))'); hold('off');
set(l1,'Location','NorthOutside');
% -----
figlist=findobj('type','figure');
fori=1:size(figlist,1)saveas(figlist(i),['HW4_fig'num2str(figlist(i))'.png']);
end
MPO 524 - HW #4
April 11, 2013
Homework #4: Maximum Covariance Analysis (MCA), Empirical Orthogonal Functions (EOFs), Principal Component Analysis (PCA), Factor Analysis, Singular Value Decomposition (SVD) of covariance matrices.
A lot of names for basically one or two things.
My two variables: precipitation, outgoing longwave radiation
Import data, and subtract off a composite seasonal cycle (which also removes the long term mean). Display the raw data sections.
Look at the precipitation covariance matrix with itself:
Let's diagonalize C_preipPrecip, i.e. find its eigenvectors V and eigenvalues d (diagonals in matrix D) in Matlab's notation V,D. Display the results.
Which is EOF1? The one with the largest eigenvalue (they are ordered from smallest to largest in Matlab so it is 144). What is the nature of EOF1? Best shown visually through its 144x144 OUTER PRODUCT: (EOF1 * EOF1').
Let's unpack that a little more for clarity:
What is the next EOF? The next checkerboard approx. to resid.
% What if we project the data onto EOF1? We get PC1.
What if we multiply using the outer product again? We get a 1-mode reconstruction of the original data.
Two mode reconstruction is better.
Notice they are orthogonal (their product sums to zero) in BOTH Space and Time (and so obviously in spacetime).
Can we do something similar to a covariance matrix that isn't symmetric? Eigenvectors are complex for a nonsymmetric matrix, that isn't quite it. Instead we can find the sets of column vectors U and V ("singular vectors") that make the best checkerboard fit to cross-covariance C_precipOlr.
The SVD operation.
Build the leading column vectors' checkerboard, lon x lon in dimension.
Let's unpack that a little more for clarity:
Reconstruction of individual fields from left SV's U (for precipitation) and right SV's V (for outgoing longwave radiation) is just like from EOFs.
Are there areas where EOF1 has the sign of covariance wrong? Yes! The sign of small-magnitude regions in E1E1 can't be trusted...
Are there areas where SVD1 has the sign of covariance wrong? Yes! The sign of small-magnitude regions in U1V1 can't be trusted...
Compute cov(T') instead of cov(T). This is the matrix of SPATIAL covariances at each TIME (T*T' /144., a 240x240 matrix). Display it, does it make much sense? No, it does not make much sense. And yet it is useful... Show that its first 144 eigenvalues are the same as the 144 eigenvalues of the normal matrix of TEMPORAL covariances. The importance is this: when climatologists have many spatial points (many thousands of grid cells all over the world), but far fewer time samples (12 months x 50 years = 600 times, say), the thousands x thousands covariance matrix is huge, and is underdetermined or rank-deficient. So you might as well do the problem the other way (600 x 600). Using the time x time sized matrix of SPATIAL covariances, for each of its 600 eigenvalues you get an eigenvector (which is now a 600-point time structure rather than a spatial EOF structure as in the normal case). You project that time structure onto the time series at each spatial grid cell, and voila! You recover the EOFs (spatial patterns). Show us please!
My codes: