Inferring relevant features: from QFT to PCA
Cédric Bény

TL;DR
This paper introduces a data-driven feature extraction method inspired by renormalization in physics. The method adapts its kernel function to an unlabeled dataset, and the resulting features classify handwritten digits better than those obtained from a simple Gaussian kernel.
Contribution
It proposes a novel technique that learns an optimal kernel function for feature extraction, bridging concepts from many-body physics and machine learning.
Findings
Learned kernel features outperform Gaussian kernel features in digit classification.
The approach effectively identifies relevant features without labeled data.
The method adapts kernel functions to the data for better feature relevance.
Abstract
In many-body physics, renormalization techniques are used to extract aspects of a statistical or quantum state that are relevant at large scales, or for low-energy experiments. Recent works have proposed that these features can be formally identified as those perturbations of the states whose distinguishability most resists coarse-graining. Here, we examine whether this same strategy can be used to identify important features of an unlabeled dataset. This approach indeed results in a technique very similar to kernel PCA (principal component analysis), but with a kernel function that is automatically adapted to the data, or "learned". We test this approach on handwritten digits, and find that the most relevant features are significantly better for classification than those obtained from a simple Gaussian kernel.
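For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of standard kernel PCA with a fixed Gaussian kernel. It is not the paper's method: the learned, data-adapted kernel is not reproduced here. The function names `gaussian_kernel` and `kernel_pca`, the bandwidth `sigma`, and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(X, sigma):
    """Gaussian (RBF) kernel matrix for the rows of X; sigma is an assumed bandwidth."""
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared pairwise distances, clipped at zero to absorb rounding error.
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def kernel_pca(K, n_components):
    """Standard kernel PCA on a precomputed kernel matrix K.

    Returns the projections of the training points onto the leading
    components, and the corresponding eigenvalues in descending order.
    """
    n = K.shape[0]
    # Center the kernel matrix in feature space.
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Projection of training point i on component k is sqrt(lambda_k) * v_k[i].
    features = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return features, eigvals


# Toy data standing in for an unlabeled dataset (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
K = gaussian_kernel(X, sigma=1.0)
features, eigvals = kernel_pca(K, n_components=3)
```

In this baseline the kernel is chosen by hand before seeing the data. According to the abstract, the proposed approach differs in that the kernel function itself is adapted to the dataset.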
