Principal Component Analysis and biplots. A Back-to-Basics Comparison of Implementations
Ettore Settanni

TL;DR
This paper compares how PCA and biplots are implemented in base-R and contributed R packages, revealing discrepancies caused by computational choices and emphasizing the importance of understanding underlying structures.
Contribution
It provides a detailed, implementation-agnostic comparison of PCA and biplots, highlighting common pitfalls and discrepancies across different R packages.
Findings
Implementation differences can lead to unexpected discrepancies in PCA and biplot outputs.
Equivalences expected from theory rarely hold without caveats in the output of specific implementations, owing to computational choices made under the hood.
A comprehensive evaluation grid can help identify and understand these discrepancies.
Abstract
Principal Component Analysis (PCA) and biplots are so well established and readily implemented that it is all too tempting to take their internal workings for granted. In this note I go back to basics and compare how PCA and biplots are implemented in base-R and in contributed R packages, drawing on an implementation-agnostic understanding of the computational structure of each technique. The aim is to illustrate discrepancies that users might find elusive because they arise from seemingly innocuous computational choices made under the hood. The proposed evaluation grid brings out aspects that are usually disregarded, including relationships that should hold if the computational rationale underpinning each technique is followed correctly. Strikingly, what is expected from these equivalences rarely follows without caveats from the output of specific implementations alone.
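One well-documented instance of such a seemingly innocuous choice is the divisor used for the covariance matrix: base-R's prcomp divides by n - 1, whereas princomp divides by n. The sketch below is not taken from the paper and is not its evaluation grid. It is a minimal illustration in plain Python (rather than R) on a made-up two-variable dataset, with hypothetical helper names, showing that the two divisors give component variances that differ by the factor n / (n - 1) while the proportion of variance explained is unaffected.

```python
import math

def covariance_2d(rows, divisor):
    """Covariance terms (sxx, sxy, syy) of two-column data for a given divisor."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mean_x) ** 2 for r in rows) / divisor
    syy = sum((r[1] - mean_y) ** 2 for r in rows) / divisor
    sxy = sum((r[0] - mean_x) * (r[1] - mean_y) for r in rows) / divisor
    return sxx, sxy, syy

def eigenvalues_sym2(sxx, sxy, syy):
    """Closed-form eigenvalues (largest first) of a symmetric 2x2 matrix."""
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    return half_trace + radius, half_trace - radius

# Made-up illustrative data: five observations of two variables.
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0)]
n = len(data)

# Component variances with divisor n - 1 (the convention prcomp follows)
# and with divisor n (the convention princomp follows).
ev_unbiased = eigenvalues_sym2(*covariance_2d(data, n - 1))
ev_population = eigenvalues_sym2(*covariance_2d(data, n))

# Each pair of variances differs by the same factor n / (n - 1) ...
ratios = [u / p for u, p in zip(ev_unbiased, ev_population)]

# ... while the share of variance carried by the first component is unchanged.
share_unbiased = ev_unbiased[0] / sum(ev_unbiased)
share_population = ev_population[0] / sum(ev_population)

print(ratios, share_unbiased, share_population)
```

With only five observations the factor is 1.25, so the reported standard deviations of the components differ noticeably even though the directions and the variance shares agree. The gap shrinks as n grows, which is one reason a discrepancy of this kind can go unnoticed.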
