TL;DR
Delta-Adapter introduces a scalable image editing approach that learns from single image pairs by extracting semantic deltas with a pre-trained encoder, enabling effective transfer of editing semantics without requiring multiple exemplar pairs.
Contribution
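The paper proposes single-pair supervision for exemplar-based image editing: a semantic delta extracted by a pre-trained vision encoder is injected into a pre-trained editing model through a Perceiver-based adapter. This improves generalization and editing accuracy over existing methods trained with pair-of-pairs supervision.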
Findings
Outperforms four strong baselines on seen editing tasks.
Generalizes more effectively to unseen editing tasks.
Leverages large-scale datasets without needing multiple exemplar pairs.
Abstract
Exemplar-based image editing applies a transformation defined by a source-target image pair to a new query image. Existing methods rely on a pair-of-pairs supervision paradigm, requiring two image pairs sharing the same edit semantics to learn the target transformation. This constraint makes training data difficult to curate at scale and limits generalization across diverse edit types. We propose Delta-Adapter, a method that learns transferable editing semantics under single-pair supervision, requiring no textual guidance. Rather than directly exposing the exemplar pair to the model, we leverage a pre-trained vision encoder to extract a semantic delta that encodes the visual transformation between the two images. This semantic delta is injected into a pre-trained image editing model via a Perceiver-based adapter. Since the target image is never directly visible to the model, it can…
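The flow described in the abstract (encode the exemplar pair, reduce it to a semantic delta, pass the delta through a Perceiver-style adapter) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not say how the delta is computed, so it is taken to be an element-wise feature difference; random vectors stand in for the output of the pre-trained vision encoder; and the adapter is reduced to one cross-attention step with no learned projections. The function names are hypothetical and are not from the paper's code.

```python
import math
import random


def semantic_delta(src_tokens, tgt_tokens):
    """Assumed delta: element-wise difference of target and source features."""
    return [[t - s for s, t in zip(s_row, t_row)]
            for s_row, t_row in zip(src_tokens, tgt_tokens)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(latents, tokens):
    """Scaled dot-product scores of each latent query against every token."""
    dim = len(latents[0])
    return [
        softmax([sum(q * k for q, k in zip(lat, tok)) / math.sqrt(dim)
                 for tok in tokens])
        for lat in latents
    ]


def resample(latents, tokens):
    """Perceiver-style step: a fixed set of latent queries cross-attends to
    the delta tokens, giving a fixed-size conditioning sequence.
    Simplification: no learned key/value/output projections."""
    weights = attention_weights(latents, tokens)
    dim = len(tokens[0])
    return [[sum(w * tok[j] for w, tok in zip(w_row, tokens))
             for j in range(dim)]
            for w_row in weights]


def random_tokens(rng, count, dim):
    """Stand-in for encoder features (hypothetical, not a real encoder)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
src = random_tokens(rng, 6, 4)      # features of the exemplar source image
tgt = random_tokens(rng, 6, 4)      # features of the exemplar target image
latents = random_tokens(rng, 2, 4)  # latent queries (learned in a real adapter)

delta = semantic_delta(src, tgt)
condition = resample(latents, delta)
print(len(condition), len(condition[0]))  # 2 4
```

In this sketch the editing model would only ever receive `condition`, which is derived from the delta rather than from the target image itself. Under the feature-difference assumption, an exemplar pair with identical source and target yields an all-zero delta and therefore an all-zero conditioning signal.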
