PRIM-cipal components analysis

Tianhao Liu; Daniel Andrés Díaz-Pachón; J. Sunil Rao

arXiv:2604.15538 · stat.ML · April 20, 2026


TL;DR

This paper studies unsupervised No Free Lunch Theorems (NFLTs) for elliptical distributions, proving that two opposite bump-hunting strategies based on principal components are equally optimal, and tests the results on Fashion-MNIST.

Contribution

It presents an unsupervised NFLT for bump-hunting with principal components and proposes PRIM-based algorithms that minimize either variance or volume, supported by empirical testing on Fashion-MNIST.

Findings

01. Peeling the smallest principal components captures multiplicity.

02. Peeling the largest principal components isolates popular styles.

03. The theoretical results motivate new bump-hunting algorithms.

Abstract

Supervised No Free Lunch Theorems (NFLTs) are well studied, yet unsupervised NFLTs remain underexplored. For elliptical distributions, we prove that there exist two equally optimal, scientifically meaningful bump-hunting strategies that are exact opposites, with no universal winner. Specifically, peeling $k$ orthogonal dimensions from $\mathbb{R}^{d}$ ($d \geq k$), retaining an inter-quantile region of probability $1 - \alpha$ per peeled dimension, maximizes total variance and Frobenius norm when the $k$ smallest principal components (called pettiest components) are selected, and minimizes them when the selected dimensions are the $k$ leading principal components. These optima inspire PRIM-based bump-hunting algorithms either by minimizing variance or by minimizing volume, thereby motivating an NFLT. We test our results on the Fashion-MNIST database, showing that peeling the largest…
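
The variance claim in the abstract can be checked numerically in a simplified setting. The sketch below is not the authors' algorithm. It assumes a Gaussian distribution (one member of the elliptical family) whose covariance is already diagonal in the principal-component basis, and it assumes the inter-quantile region is the symmetric central interval of probability $1 - \alpha$. Under these assumptions each peeled coordinate keeps a fixed fraction of its variance, so the total retained variance can be compared across all choices of $k$ coordinates by brute force. All function names are hypothetical.

```python
from itertools import combinations
from statistics import NormalDist


def truncation_factor(alpha):
    """Fraction of variance a normal variable keeps when truncated to its
    symmetric central interval of probability 1 - alpha (assumed region)."""
    std = NormalDist()
    z = std.inv_cdf(1 - alpha / 2)  # upper endpoint in standard units
    return 1 - 2 * z * std.pdf(z) / (1 - alpha)


def total_variance_after_peeling(eigenvalues, peeled, alpha):
    """Total variance when the coordinates in `peeled` are truncated.
    Assumes independent Gaussian coordinates with the given variances."""
    c = truncation_factor(alpha)
    return sum(c * lam if i in peeled else lam
               for i, lam in enumerate(eigenvalues))


def best_and_worst_subsets(eigenvalues, k, alpha):
    """Brute-force the index sets that maximize and minimize total variance."""
    subsets = list(combinations(range(len(eigenvalues)), k))

    def score(subset):
        return total_variance_after_peeling(eigenvalues, set(subset), alpha)

    return max(subsets, key=score), min(subsets, key=score)


if __name__ == "__main__":
    # Hypothetical eigenvalues, sorted from leading to smallest component.
    eigenvalues = [5.0, 3.0, 2.0, 0.5, 0.1]
    best, worst = best_and_worst_subsets(eigenvalues, k=2, alpha=0.1)
    print("maximizing subset:", best)
    print("minimizing subset:", worst)
```

With these illustrative eigenvalues, the maximizing subset consists of the two smallest components and the minimizing subset of the two leading ones, in line with the statement in the abstract. The general elliptical case and the Frobenius-norm claim are not covered by this sketch.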
