EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata

Chenhao Zheng; Ayush Shrivastava; Andrew Owens

arXiv:2301.04647 · cs.CV · June 21, 2023

Open Access

TL;DR

This paper introduces a multimodal embedding model that learns to associate image patches with camera metadata, enabling improved performance on image forensics and calibration tasks, including zero-shot splicing localization.

Contribution

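The paper learns cross-modal associations between image patches and EXIF metadata, converting the metadata to text and processing it with a transformer. The resulting visual features outperform other self-supervised and supervised features on downstream image forensics and calibration tasks.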

Findings

01. Significantly better performance on forensics and calibration tasks.

02. Effective zero-shot localization of spliced regions.

03. Outperforms prior self-supervised and supervised methods.
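
The abstract describes the zero-shot localization as clustering the visual embeddings of all patches within an image. The sketch below illustrates that idea only. The source does not name the clustering algorithm, so a small spherical 2-means is swapped in here as a stand-in; the function names, the seeding rule, and the toy 2-D embeddings are all hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def two_way_cluster(embeddings, iterations=10):
    """Split patch embeddings into two groups by cosine similarity.

    Stand-in clustering step (spherical 2-means), not the paper's procedure.
    """
    points = [normalize(e) for e in embeddings]
    # Seed with the first patch and the patch least similar to it.
    first = points[0]
    far = min(points, key=lambda p: dot(p, first))
    centers = [first, far]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each patch to the more similar center.
        labels = [
            0 if dot(p, centers[0]) >= dot(p, centers[1]) else 1
            for p in points
        ]
        # Recompute each center as the normalized mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centers[k] = normalize(mean)
    return labels


# Toy embeddings: three patches from one "camera", two from another.
patch_embeddings = [
    [1.0, 0.1], [0.9, 0.0], [1.0, -0.1],
    [0.1, 1.0], [0.0, 0.9],
]
labels = two_way_cluster(patch_embeddings)
print(labels)
```

In a real image, the patches falling in the smaller group would be candidates for the spliced region; treating the minority cluster this way is an assumption of this sketch rather than something the page states.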

Abstract

We learn a visual representation that captures information about the camera that recorded a given photo. To do this, we train a multimodal embedding between image patches and the EXIF metadata that cameras automatically insert into image files. Our model represents this metadata by simply converting it to text and then processing it with a transformer. The features that we learn significantly outperform other self-supervised and supervised features on downstream image forensics and calibration tasks. In particular, we successfully localize spliced image regions "zero shot" by clustering the visual embeddings for all of the patches within an image.
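
As a rough illustration of "simply converting it to text", the sketch below joins EXIF tag/value pairs into a single string that a text transformer could consume. The `name: value` format, the separator, the helper name `exif_to_text`, and the tag values are assumptions made for illustration; the abstract does not specify the exact text template.

```python
def exif_to_text(tags):
    """Join EXIF tag/value pairs into one caption-like string.

    The "name: value" layout is an assumed format, not taken from the paper.
    """
    parts = []
    for name, value in tags.items():
        if value is None:
            continue  # skip tags the camera did not record
        parts.append(f"{name}: {value}")
    return ", ".join(parts)


# Hypothetical tag values, for illustration only.
example_tags = {
    "Make": "Canon",
    "Model": "Canon EOS 5D",
    "ExposureTime": "1/250",
    "Flash": None,
}
text = exif_to_text(example_tags)
print(text)  # Make: Canon, Model: Canon EOS 5D, ExposureTime: 1/250
```

Treating metadata as free text avoids designing a separate encoder for each tag type, which is consistent with the abstract's emphasis on simplicity.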

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Anomaly Detection Techniques and Applications