A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny
By: Karahan Sarıtaş, Çağatay Yıldız
Potential Business Impact:
Shows that self-attention does not actually implement kernel PCA, contrary to a recent claim.
In this reproduction study, we revisit recent claims that self-attention implements kernel principal component analysis (KPCA) (Teo et al., 2024), positing that (i) the value vectors $V$ capture the eigenvectors of the Gram matrix of the keys, and (ii) self-attention projects the queries onto the principal component axes of the key matrix $K$ in a feature space. Our analysis reveals three critical inconsistencies: (1) the learned value vectors of self-attention show no alignment with what the KPCA perspective predicts, with average similarity metrics (optimal cosine similarity $\leq 0.32$, linear CKA (Centered Kernel Alignment) $\leq 0.11$, kernel CKA $\leq 0.32$) indicating negligible correspondence; (2) reported decreases in the reconstruction loss $J_\text{proj}$, offered as evidence that self-attention minimizes the projection error of KPCA, are misinterpreted, as the quantities being compared differ by orders of magnitude ($\sim\!10^3$); (3) the Gram matrix eigenvalue statistics, introduced to justify the claim that $V$ captures the eigenvectors of the Gram matrix, are irreproducible without undocumented implementation-specific adjustments. Across 10 transformer architectures, we conclude that the KPCA interpretation of self-attention lacks empirical support.
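The kind of alignment check described above can be illustrated with a small sketch: take the top eigenvectors of the (here, linear-kernel) Gram matrix built from a key matrix and score their alignment with a value matrix using an optimal column matching of cosine similarities and linear CKA. This is not the paper's pipeline; the array shapes, the random stand-in matrices, the linear kernel, and the column-wise comparison convention are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def linear_cka(X, Y):
    """Linear CKA between two representation matrices (rows = samples)."""
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(X.T @ Y, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

def optimal_cosine(A, B):
    """Mean |cosine| over an optimal one-to-one matching of columns."""
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    B = B / np.linalg.norm(B, axis=0, keepdims=True)
    sim = np.abs(A.T @ B)                   # |cosine| between column pairs
    row, col = linear_sum_assignment(-sim)  # maximize total matched similarity
    return sim[row, col].mean()

# Toy alignment check: do value vectors V line up with the eigenvectors
# of the Gram matrix of the keys K, as the KPCA reading posits?
rng = np.random.default_rng(0)
N, d = 64, 32                            # tokens, head dimension (illustrative)
K = rng.standard_normal((N, d))          # keys for one head / one sequence
V = rng.standard_normal((N, d))          # values (stand-in for learned values)

gram = K @ K.T                           # linear-kernel Gram matrix of the keys
eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
top = eigvecs[:, -d:]                    # top-d eigenvectors as columns (N x d)

print("optimal |cosine|:", optimal_cosine(V, top))
print("linear CKA:      ", linear_cka(V, top))
```

With random inputs both scores stay low by construction; the point of the sketch is only to show the form of comparison behind numbers like "optimal cosine similarity $\leq 0.32$" and "linear CKA $\leq 0.11$", not to reproduce them.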
Similar Papers
Self-attention vector output similarities reveal how machines pay attention
Computation and Language
Helps computers understand sentences by focusing on key words.
Rethinking PCA Through Duality
Machine Learning (CS)
Finds hidden patterns in data more easily.
Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix
Machine Learning (Stat)
Makes AI understand words better by analyzing their connections.