TL;DR
This study critically examines the claim that self-attention in transformers functions as kernel PCA, finding no empirical evidence to support this interpretation across multiple architectures.
Contribution
The paper provides a rigorous reproduction and refutation of prior claims, highlighting inconsistencies in, and a lack of empirical support for, the KPCA interpretation of self-attention.
Findings
Negligible similarity between value vectors and KPCA eigenvectors
Misinterpretation of projection loss decreases
Inability to reproduce eigenvalue statistics without undocumented adjustments
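One of the similarity measures behind the first finding is linear CKA (Centered Kernel Alignment). The following is a minimal sketch of the standard linear CKA formula in NumPy, not the study's own code; the function name `linear_cka` and the row-per-sample layout are assumptions made for illustration.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between two representations of the same n samples.

    X has shape (n, d1) and Y has shape (n, d2); rows must be aligned.
    Hypothetical helper for illustration, not taken from the paper.
    """
    # Center each feature column so the score ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)
```

The score lies in [0, 1] and is invariant to orthogonal transformations and isotropic scaling of either input, so a value near zero indicates that two sets of vectors share little linear structure.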
Abstract
In this reproduction study, we revisit recent claims that self-attention implements kernel principal component analysis (KPCA) (Teo et al., 2024), which posit that (i) value vectors capture the eigenvectors of the Gram matrix of the keys, and (ii) self-attention projects queries onto the principal component axes of the key matrix in a feature space. Our analysis reveals three critical inconsistencies: (1) No alignment exists between learned self-attention value vectors and what is proposed in the KPCA perspective, with average similarity metrics (optimal cosine similarity, linear CKA (Centered Kernel Alignment), kernel CKA) indicating negligible correspondence; (2) Reported decreases in reconstruction loss, arguably justifying the claim that self-attention minimizes the projection error of KPCA, are misinterpreted, as the…
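Claim (i) refers to the eigenvectors of the Gram matrix of the keys. The sketch below shows how such eigenvectors could be computed for a small key matrix. It assumes an exponential dot-product kernel scaled by the square root of the key dimension and standard KPCA centering; both choices, and the function name `kpca_eigen`, are illustrative assumptions and not necessarily the exact construction used by Teo et al. (2024) or by this study.

```python
import numpy as np


def kpca_eigen(keys, n_components):
    """Top eigenpairs of the centered Gram matrix of `keys` (shape (n, d)).

    Hypothetical helper: the kernel exp(k_i . k_j / sqrt(d)) is an assumed
    choice made only for this illustration.
    """
    n, d = keys.shape

    # Gram matrix under the assumed exponential dot-product kernel.
    gram = np.exp(keys @ keys.T / np.sqrt(d))

    # Center in feature space: G_c = H G H with H = I - (1/n) 1 1^T.
    H = np.eye(n) - np.ones((n, n)) / n
    gram_c = H @ gram @ H

    # eigh handles symmetric matrices and returns ascending eigenvalues,
    # so reorder to put the largest components first.
    eigvals, eigvecs = np.linalg.eigh(gram_c)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]
```

Under the KPCA reading, the columns of the returned eigenvector matrix are the quantities that learned value vectors would be expected to align with, which is what similarity measures such as cosine similarity and CKA are used to test.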