1 min read · Jul 13, 2019
Reza, thank you for taking the time to read my article and respond! To address the specific points you make:
- You are correct that the “new” independent variables would be identical to the “old” independent variables (up to reordering and sign) in the rare case that our independent variables are perfectly uncorrelated with one another. You’re right that the covariance matrix is already diagonal! That said, this almost never happens with real-world data unless the data was specifically constructed that way, for example by an experiment with an orthogonal design.
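To make that concrete, here is a minimal sketch (using hypothetical simulated data, not anything from the article) showing that when the features are generated independently, the sample covariance matrix is already nearly diagonal and PCA simply recovers the original axes, up to ordering and sign:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Three independently generated features with different variances (3, 2, 1 std devs)
X = rng.normal(0, [3.0, 2.0, 1.0], size=(10_000, 3))

# The sample covariance matrix is already (nearly) diagonal
cov = np.cov(X, rowvar=False)
print(np.round(cov, 2))

# PCA recovers the original axes: each row of components_ has a
# single entry near ±1, so the "new" variables match the "old" ones
pca = PCA(n_components=3).fit(X)
print(np.round(pca.components_, 2))
```

With real (correlated) data, `components_` would instead be a nontrivial rotation mixing the original features.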
- I do reference the dependent variable Y because we can often use our principal components as new features in a regression model where we regress Y on X (or Z). However, mentioning Y doesn’t affect the actual process of PCA, because PCA is performed only on the independent variables (it is an unsupervised learning technique). Y comes up only because we often use PCA as a pre-processing step before applying some type of supervised learning, where Y is relevant.
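That division of labor can be sketched in a few lines (again with made-up simulated data): the PCA step is fit on X alone, and Y only enters at the later, supervised regression step.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)

# Five features; the first two have the largest variances
X = rng.normal(0, [3.0, 2.0, 1.0, 0.5, 0.5], size=(500, 5))

# Hypothetical target driven by the high-variance features
y = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=500)

# PCA.fit inside the pipeline sees only X (unsupervised);
# y is used solely by the LinearRegression step, which is fit on Z
model = make_pipeline(PCA(n_components=2), LinearRegression())
model.fit(X, y)
print(model.score(X, y))
```

Since the target here happens to lie along the high-variance directions, two components suffice; in general there is no guarantee the top components are the ones most predictive of Y, which is exactly why PCA is unsupervised.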