PRIM-cipal components analysis
Tianhao Liu, Daniel Andrés Díaz-Pachón, J. Sunil Rao

TL;DR
This paper explores unsupervised No Free Lunch Theorems (NFLTs) for elliptical distributions, revealing two equally optimal but opposite bump-hunting strategies based on principal components, and demonstrates their practical implications on Fashion-MNIST.
Contribution
It introduces a novel NFLT for unsupervised bump-hunting using principal components, with PRIM-based algorithms that minimize either variance or volume, supported by empirical testing on Fashion-MNIST.
Findings
Peeling the smallest principal components captures multiplicity.
Peeling the largest principal components isolates popular styles.
Theoretical results motivate new bump-hunting algorithms.
Abstract
Supervised No Free Lunch Theorems (NFLTs) are well studied, yet unsupervised NFLTs remain underexplored. For elliptical distributions, we prove that there exist two equally optimal, scientifically meaningful bump-hunting strategies that are exact opposites, with no universal winner. Specifically, peeling orthogonal dimensions, while retaining an inter-quantile region of a given probability per peeled dimension, maximizes total variance and Frobenius norm when the smallest principal components (called pettiest components) are selected, and minimizes them when the selected dimensions are the leading principal components. These optima inspire PRIM-based bump-hunting algorithms either by minimizing variance or by minimizing volume, thereby motivating an NFLT. We test our results on the Fashion-MNIST database, showing that peeling the largest…
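The peeling idea in the abstract can be illustrated with a minimal sketch, assuming synthetic Gaussian data (one example of an elliptical distribution) rather than Fashion-MNIST. The function names `peel_principal_components` and `total_variance`, the parameter `prob` (probability mass retained per peeled dimension), and the choice of a central, equal-tailed inter-quantile region are illustrative assumptions, not the authors' code or notation.

```python
import numpy as np


def peel_principal_components(X, k, prob=0.8, which="smallest"):
    """Keep the rows of X whose scores on k chosen principal components
    fall inside a central inter-quantile region of mass `prob` (assumed)."""
    Xc = X - X.mean(axis=0)
    # eigh returns eigenvalues (and matching eigenvectors) in ascending order
    _, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    p = X.shape[1]
    if which == "smallest":
        idx = np.arange(k)            # the k smallest principal components
    else:
        idx = np.arange(p - k, p)     # the k leading principal components
    scores = Xc @ eigvecs[:, idx]
    tail = (1.0 - prob) / 2.0
    lo = np.quantile(scores, tail, axis=0)
    hi = np.quantile(scores, 1.0 - tail, axis=0)
    keep = np.all((scores >= lo) & (scores <= hi), axis=1)
    return X[keep]


def total_variance(X):
    """Trace of the sample covariance matrix."""
    return float(np.trace(np.cov(X, rowvar=False)))


# Synthetic Gaussian sample with unequal variances per coordinate (assumed setup)
rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 5)) * np.sqrt([9.0, 4.0, 1.0, 0.5, 0.1])

kept_small = peel_principal_components(X, k=2, prob=0.8, which="smallest")
kept_large = peel_principal_components(X, k=2, prob=0.8, which="largest")
tv_small = total_variance(kept_small)
tv_large = total_variance(kept_large)
print(f"total variance, peeling smallest PCs: {tv_small:.3f}")
print(f"total variance, peeling largest PCs:  {tv_large:.3f}")
```

On this sample, the retained data should show a larger total variance when the two smallest components are peeled than when the two leading ones are, which matches the direction of the optimum stated in the abstract. This is only a numerical illustration on one distribution; it is not a proof and does not reproduce the paper's experiments.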
