From Pixels to Components: Eigenvector Masking for Visual Representation Learning

Alice Bizeul; Thomas Sutter; Alain Ryser; Bernhard Schölkopf; Julius von Kügelgen; Julia E. Vogt

arXiv:2502.06314 · cs.LG · February 12, 2025 · 2 cites

Open Access · 1 Repo

TL;DR

This paper introduces a masked image modeling approach that masks principal components rather than patches of pixels, which encourages learning of high-level features and improves image classification performance.

Contribution

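The paper proposes a PCA-based masking strategy: a random subset of principal components, accounting for a fixed ratio of the data variance, is masked and then reconstructed from the visible components. Compared with pixel-patch masking, this favors the learning of global, high-level features.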

Findings

01 Component masking improves image classification accuracy.

02 Component masking captures more global information than pixel masking.

03 The PCA-based approach is simple and robust.

Abstract

Predicting masked from visible parts of an image is a powerful self-supervised approach for visual representation learning. However, the common practice of masking random patches of pixels exhibits certain failure modes, which can prevent learning meaningful high-level features, as required for downstream tasks. We propose an alternative masking strategy that operates on a suitable transformation of the data rather than on the raw pixels. Specifically, we perform principal component analysis and then randomly mask a subset of components, which accounts for a fixed ratio of the data variance. The learning task then amounts to reconstructing the masked components from the visible ones. Compared to local patches of pixels, the principal components of images carry more global information. We thus posit that predicting masked from visible components involves more high-level features,…
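
The masking step described in the abstract can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (the official repository is the reference for that). The function names `pca_basis`, `sample_component_mask`, and `split_image`, the shuffle-then-accumulate sampling rule, and the choice to project visible and masked components back to pixel space are all assumptions made for this example.

```python
import numpy as np


def pca_basis(X):
    """PCA of flattened images X with shape (n_samples, n_pixels).

    Returns the data mean, the principal directions (rows of Vt),
    and each component's share of the total variance.
    """
    mean = X.mean(axis=0)
    # Rows of Vt from the SVD of the centered data are the principal directions.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    variance = s ** 2
    return mean, Vt, variance / variance.sum()


def sample_component_mask(ratio, masked_variance, rng):
    """Pick random components until they cover `masked_variance` of the variance.

    Assumption: components are shuffled and accumulated in that order.
    The paper's exact sampling rule may differ.
    """
    order = rng.permutation(len(ratio))
    cumulative = np.cumsum(ratio[order])
    # Smallest k such that the first k shuffled components reach the target.
    k = min(int(np.searchsorted(cumulative, masked_variance)) + 1, len(ratio))
    mask = np.zeros(len(ratio), dtype=bool)
    mask[order[:k]] = True  # True marks a masked component
    return mask


def split_image(x, mean, Vt, mask):
    """Split one flattened image into a visible input and a masked target.

    Both parts are returned in pixel space (an assumption of this sketch).
    """
    z = (x - mean) @ Vt.T          # coefficients in the principal-component basis
    visible = (z * ~mask) @ Vt     # contribution of the visible components
    target = (z * mask) @ Vt       # contribution of the masked components
    return visible, target


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 64))  # stand-in for 200 flattened 8x8 images
    mean, Vt, ratio = pca_basis(X)
    mask = sample_component_mask(ratio, masked_variance=0.3, rng=rng)
    visible, target = split_image(X[0], mean, Vt, mask)
    print("masked components:", int(mask.sum()), "of", len(ratio))
    print("masked variance share:", float(ratio[mask].sum()))
```

In this sketch a model would receive `visible` as input and be trained to predict `target`. Because the principal directions are orthonormal, `visible + target + mean` recovers the original image whenever it lies in the span of the retained components.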

Code & Models

Repositories

alicebizeul/pmae
PyTorch · Official

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques