Image Reconstruction as a Tool for Feature Analysis

Eduard Allakhverdov; Dmitrii Tarasov; Elizaveta Goncharova; Andrey Kuznetsov

arXiv:2506.07803 · cs.CV · June 10, 2025

TL;DR

This paper interprets the internal feature representations of vision encoders by reconstructing images from them, showing how much image information each encoder retains and how properties such as color are encoded.

Contribution

The paper presents an image-reconstruction approach for interpreting vision features and compares how different training objectives (SigLIP versus SigLIP2) affect the informativeness of those features.

Findings

01

Encoders pre-trained on image-based tasks retain significantly more image information than those trained on non-image tasks such as contrastive learning.

02

Orthogonal rotations in feature space control color encoding.

03

The method applies broadly to any vision encoder.
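
As a rough illustration of how reconstruction quality can serve as a proxy for feature informativeness, the sketch below ranks two frozen "encoders" by how well a decoder recovers the input from their features. Everything in it is an assumption made for illustration: the encoders are random linear projections (`encoder_wide`, `encoder_narrow` are hypothetical names), the images are synthetic vectors, and the decoder is a closed-form ridge regression rather than the reconstruction model used in the paper.

```python
import numpy as np


def reconstruction_error(features, images, train_frac=0.8, ridge=1e-3):
    """Fit a linear decoder (features -> pixels) on a train split and
    return the mean squared error on the held-out split."""
    n_train = int(train_frac * len(features))
    f_train, f_test = features[:n_train], features[n_train:]
    x_train, x_test = images[:n_train], images[n_train:]
    # Closed-form ridge regression: W = (F^T F + lambda * I)^{-1} F^T X
    gram = f_train.T @ f_train + ridge * np.eye(f_train.shape[1])
    weights = np.linalg.solve(gram, f_train.T @ x_train)
    return float(np.mean((f_test @ weights - x_test) ** 2))


rng = np.random.default_rng(0)
images = rng.normal(size=(500, 48))  # synthetic stand-in for flattened images

# Hypothetical frozen encoders: one keeps 32 dimensions, the other only 4.
encoders = {
    "encoder_wide": rng.normal(size=(48, 32)),
    "encoder_narrow": rng.normal(size=(48, 4)),
}

# Lower held-out reconstruction error is read as "more image information retained".
scores = {
    name: reconstruction_error(images @ projection, images)
    for name, projection in encoders.items()
}
ranking = sorted(scores, key=scores.get)
print(ranking, scores)
```

The same loop works for any encoder that maps an image to a feature array, which is the sense in which a reconstruction-based comparison is encoder-agnostic; only the decoder would need to be replaced by something more expressive for real images.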

Abstract

Vision encoders are increasingly used in modern applications, from vision-only models to multimodal systems such as vision-language models. Despite their remarkable success, it remains unclear how these architectures represent features internally. Here, we propose a novel approach for interpreting vision features via image reconstruction. We compare two related model families, SigLIP and SigLIP2, which differ only in their training objective, and show that encoders pre-trained on image-based tasks retain significantly more image information than those trained on non-image tasks such as contrastive learning. We further apply our method to a range of vision encoders, ranking them by the informativeness of their feature representations. Finally, we demonstrate that manipulating the feature space yields predictable changes in reconstructed images, revealing that orthogonal rotations (rather…
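
The abstract is cut off before it says how the rotations are obtained, so the following is only a sketch of one standard way to estimate an orthogonal map between two sets of features: the orthogonal Procrustes solution computed from an SVD. The feature matrices are synthetic, and `fit_orthogonal_map` is a hypothetical helper name, not something taken from the paper.

```python
import numpy as np


def fit_orthogonal_map(source, target):
    """Return the orthogonal Q minimizing ||source @ Q - target||_F
    (orthogonal Procrustes, solved via SVD of source^T target)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)
dim = 16
features = rng.normal(size=(200, dim))  # synthetic "original" features

# Hypothetical ground-truth transform: a random orthogonal matrix from a QR factorization.
true_q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
edited = features @ true_q  # features of the "edited" images

q = fit_orthogonal_map(features, edited)

# An orthogonal map preserves the norm of every feature vector.
norms_before = np.linalg.norm(features, axis=1)
norms_after = np.linalg.norm(features @ q, axis=1)
print(np.allclose(q.T @ q, np.eye(dim)), np.allclose(norms_before, norms_after))
```

In a setting like the one described, `features` and `edited` would hold encoder features for paired images (for example, an image and a recolored copy), and applying `q` to new features before decoding would test whether the edit transfers.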

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning