Processing and acquisition traces in visual encoders: What does CLIP know about your camera?
Ryan Ramos, Vladan Stojnić, Giorgos Kordopatis-Zilos, Yuta Nakashima, Giorgos Tolias, Noa Garcia

TL;DR
This paper investigates how visual encoders like CLIP encode subtle camera and image acquisition parameters, revealing their influence on semantic predictions and the potential for these parameters to be recovered from learned representations.
Contribution
The paper demonstrates that acquisition and processing parameters are systematically encoded in visual representations and can significantly affect semantic predictions, which adds a new dimension to the interpretability of visual encoders.
Findings
Acquisition parameters are systematically encoded in visual representations.
These parameters can be recovered from the learned features.
Their presence can positively or negatively influence semantic predictions.
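A common way to test whether a property can be recovered from learned features is to fit a linear probe on frozen embeddings. The sketch below illustrates that idea only; this summary does not confirm it as the authors' protocol. Everything in it is an assumption made for illustration: the features are synthetic Gaussian vectors standing in for encoder embeddings, the three classes stand in for a hypothetical acquisition parameter (for example, three compression settings), and the helper names make_synthetic_features, fit_linear_probe, and probe_accuracy are invented here.

```python
import numpy as np


def make_synthetic_features(n_per_class=200, n_classes=3, dim=64,
                            trace_strength=1.0, seed=0):
    """Synthetic stand-in for frozen embeddings (NOT real CLIP features).

    Each class (a hypothetical acquisition setting) shifts the embedding
    along its own random unit direction, scaled by trace_strength.
    """
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(n_classes, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    labels = np.repeat(np.arange(n_classes), n_per_class)
    noise = rng.normal(size=(labels.size, dim))
    features = noise + trace_strength * directions[labels]
    perm = rng.permutation(labels.size)  # shuffle before splitting
    return features[perm], labels[perm]


def fit_linear_probe(features, labels, n_classes, ridge=1e-2):
    """Closed-form ridge regression onto one-hot targets (a linear probe)."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # bias column
    Y = np.eye(n_classes)[labels]
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ Y)


def probe_accuracy(weights, features, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return float(np.mean(np.argmax(X @ weights, axis=1) == labels))


# Train on 80% of the samples and evaluate on the held-out 20%.
features, labels = make_synthetic_features(trace_strength=3.0)
split = int(0.8 * len(labels))
weights = fit_linear_probe(features[:split], labels[:split], n_classes=3)
accuracy = probe_accuracy(weights, features[split:], labels[split:])
print(f"held-out probe accuracy: {accuracy:.2f} (chance is about 0.33)")
```

In a real experiment the synthetic arrays would be replaced by embeddings extracted from a frozen encoder and by labels read from image metadata or from the processing applied. Held-out accuracy well above chance would then suggest that the parameter is linearly recoverable from the representation.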
Abstract
Prior work has analyzed the robustness of visual encoders to image transformations and corruptions, particularly in cases where such alterations are not seen during training. When this occurs, they introduce a form of distribution shift at test time, often leading to performance degradation. The primary focus has been on severe corruptions that, when applied aggressively, distort useful signals necessary for accurate semantic predictions. We take a different perspective by analyzing parameters of the image acquisition process and transformations that may be subtle or even imperceptible to the human eye. We find that such parameters are systematically encoded in the learned visual representations and can be easily recovered. More strikingly, their presence can have a profound impact, either positively or negatively, on semantic predictions. This effect depends on whether there is a…
