Inferring relevant features: from QFT to PCA

Cédric Bény

arXiv:1802.05756·cs.LG·December 20, 2018

TL;DR

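This paper introduces an unsupervised feature extraction method inspired by renormalization in many-body physics. The method closely resembles kernel PCA, but its kernel function is adapted to the data rather than fixed in advance. On handwritten digits, the resulting features are significantly better for classification than those from a simple Gaussian kernel.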

Contribution

It proposes a technique that learns the kernel function used for feature extraction from the unlabeled data itself, connecting renormalization ideas from many-body physics with kernel PCA in machine learning.

Findings

01

Learned kernel features outperform Gaussian kernel features in digit classification.

02

The approach effectively identifies relevant features without labeled data.

03

The method adapts kernel functions to the data for better feature relevance.

Abstract

In many-body physics, renormalization techniques are used to extract aspects of a statistical or quantum state that are relevant at large scale, or for low-energy experiments. Recent works have proposed that these features can be formally identified as those perturbations of the states whose distinguishability most resists coarse-graining. Here, we examine whether this same strategy can be used to identify important features of an unlabeled dataset. This approach indeed results in a technique very similar to kernel PCA (principal component analysis), but with a kernel function that is automatically adapted to the data, or "learned". We test this approach on handwritten digits, and find that the most relevant features are significantly better for classification than those obtained from a simple Gaussian kernel.
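
For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of standard kernel PCA with a fixed Gaussian kernel, written with NumPy. It is not the paper's learned-kernel method, which this page does not describe in enough detail to reproduce. The function name `gaussian_kernel_pca`, the bandwidth parameter `sigma`, and the random test data are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel_pca(X, n_components=2, sigma=1.0):
    """Standard kernel PCA with a fixed Gaussian (RBF) kernel.

    Illustrative baseline only; the kernel here is NOT learned from data.
    X has shape (n_samples, n_dims).
    """
    # Pairwise squared Euclidean distances between rows of X.
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negatives

    # Gaussian kernel matrix with bandwidth sigma (hypothetical default).
    K = np.exp(-sq_dists / (2.0 * sigma**2))

    # Center the kernel matrix, i.e. center the data in feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # Eigendecomposition of the symmetric centered kernel matrix.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Coordinates of the training points along the leading components.
    features = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return features, eigvals

# Usage on synthetic data (a stand-in for real inputs such as digit images).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
features, eigvals = gaussian_kernel_pca(X, n_components=3, sigma=2.0)
```

The returned `features` could then be fed to any classifier. In this baseline `sigma` has to be chosen by hand, whereas the abstract describes a kernel that is adapted to the data automatically.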
