Lecture 23: PCA

Raguvir Kunani and Isaac Schmidt, Summer 2021

Let's import our data and see what we have.

The first thing we need to do is center our data. We could standardize as well, but as each column is on roughly the same scale, we will not do so here.
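As a minimal sketch of the centering step, the snippet below subtracts each column's mean. The array `scores` and its values are hypothetical stand-ins, not the lecture's dataset. The optional standardization step is shown only for comparison.

```python
import numpy as np

# Hypothetical scores (rows = students, columns = midterm, final).
# These values are made up for illustration; they are not the lecture's data.
scores = np.array([
    [70.0, 65.0],
    [85.0, 80.0],
    [60.0, 72.0],
    [90.0, 95.0],
    [75.0, 68.0],
])

# Centering: subtract each column's mean so every column has mean 0.
centered = scores - scores.mean(axis=0)

# Optional standardization (not done in this lecture): also divide by
# each column's standard deviation so every column has unit variance.
standardized = centered / scores.std(axis=0)

print(centered.mean(axis=0))
```

After centering, each column mean is zero up to floating-point error, which is what the covariance calculation later relies on.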

Midterm Exam and Final Exam

Let's plot our data.

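Let's calculate the covariance matrix for these two columns. Notice how $X^T X$, computed on the centered data and divided by $n - 1$ (np.cov's default normalization, where $n$ is the number of rows), returned the same matrix as np.cov.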
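The comparison can be sketched as follows, assuming a small hypothetical data matrix (the names `X_raw`, `cov_manual`, and `cov_numpy` are illustrative, not from the lecture). It relies on `np.cov` using its default `n - 1` normalization and on `rowvar=False` so that columns are treated as variables.

```python
import numpy as np

# Hypothetical data (rows = observations, columns = two variables).
X_raw = np.array([
    [70.0, 65.0],
    [85.0, 80.0],
    [60.0, 72.0],
    [90.0, 95.0],
    [75.0, 68.0],
])

# Center the columns first; the X^T X formula only matches the
# covariance matrix when each column has mean 0.
X = X_raw - X_raw.mean(axis=0)
n = X.shape[0]

# Covariance computed by hand from the centered matrix.
cov_manual = X.T @ X / (n - 1)

# Covariance from NumPy; rowvar=False means each column is a variable.
cov_numpy = np.cov(X, rowvar=False)

same = np.allclose(cov_manual, cov_numpy)
print(same)
```

Without the division by `n - 1`, the two matrices would differ by that constant factor, although their eigenvectors would be the same.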

Now, let's determine the eigenvalues and eigenvectors of this matrix. We'll use np.linalg.eigh, which is a faster implementation than np.linalg.eig for symmetric matrices (which covariance matrices always are).
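A minimal sketch of the eigendecomposition, again on hypothetical data rather than the lecture's. One detail worth knowing: `np.linalg.eigh` returns eigenvalues in ascending order, so the sketch reorders them to put the largest first, which is a common convention for PCA but an assumption about how the lecture proceeds.

```python
import numpy as np

# Hypothetical data (rows = observations, columns = two variables).
X_raw = np.array([
    [70.0, 65.0],
    [85.0, 80.0],
    [60.0, 72.0],
    [90.0, 95.0],
    [75.0, 68.0],
])
X = X_raw - X_raw.mean(axis=0)
cov = np.cov(X, rowvar=False)

# eigh is intended for symmetric (Hermitian) matrices.
# eigenvectors[:, i] is the unit eigenvector paired with eigenvalues[i].
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# eigh returns eigenvalues in ascending order; reorder so the
# direction of greatest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(eigenvalues)
print(eigenvectors)
```

Each column of `eigenvectors` satisfies `cov @ v = lambda * v` for its paired eigenvalue, and the columns are orthonormal.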

Now, we can plot the eigenvectors, each scaled by its relative eigenvalue. Note that we've scaled up both eigenvectors by the same constant, so they are more readable on the plot.
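One way to compute the arrows for such a plot is sketched below. It assumes "relative eigenvalue" means each eigenvalue divided by the sum of the eigenvalues, and it uses an arbitrary `display_scale` constant; both are assumptions for illustration, as is the data. The resulting columns of `arrows` could then be drawn from the origin with a plotting library.

```python
import numpy as np

# Hypothetical data (rows = observations, columns = two variables).
X_raw = np.array([
    [70.0, 65.0],
    [85.0, 80.0],
    [60.0, 72.0],
    [90.0, 95.0],
    [75.0, 68.0],
])
X = X_raw - X_raw.mean(axis=0)
cov = np.cov(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Assumed meaning of "relative eigenvalue": share of total variance.
relative = eigenvalues / eigenvalues.sum()

# Arbitrary constant applied to both arrows for readability.
display_scale = 20.0

# Column i of `arrows` is eigenvector i stretched to length
# relative[i] * display_scale.
arrows = eigenvectors * relative * display_scale

print(arrows)
```

Because both arrows share the same constant, their relative lengths still reflect how much variance each direction captures.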