DM-Align: Leveraging the Power of Natural Language Instructions to Make Changes to Images
Maria Mihaela Trusca, Tinne Tuytelaars, Marie-Francine Moens

TL;DR
DM-Align is a novel image editing model that uses natural language instructions and explicit word alignments to improve control, transparency, and preservation of image details during editing.
Contribution
It introduces a new method that explicitly reasons about image parts to alter or preserve, enhancing control and explainability in text-based image editing.
Findings
Outperforms state-of-the-art baselines in image editing quality
Better preserves background and handles long instructions
Demonstrates superior quantitative and qualitative results
Abstract
Text-based semantic image editing involves manipulating an image with a natural language instruction. Although recent works can generate creative, high-quality images, the problem is still mostly approached as a black box that is prone to generating unexpected outputs. We therefore propose a novel model that enhances the text-based control of an image editor by explicitly reasoning about which parts of the image to alter or preserve. It relies on word alignments between a description of the original source image and the instruction that reflects the needed updates, and the input image. The proposed Diffusion Masking with word Alignments (DM-Align) allows the editing of an image in a transparent and explainable way. It is evaluated on a subset of the Bison dataset and a self-defined dataset dubbed Dream. When compared to state-of-the-art baselines, quantitative and…
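To illustrate the idea of aligning the words of a source description with the words of an edit instruction, here is a minimal sketch. It is not the authors' implementation: it covers only the text side and uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever word aligner DM-Align uses. The function name `align_words` and the output labels (`preserve`, `replace`, `delete`, `insert`) are hypothetical and chosen for illustration.

```python
from difflib import SequenceMatcher


def align_words(source_caption: str, instruction: str) -> dict:
    """Token-level alignment between a source caption and an edit instruction.

    Assumption: simple whitespace tokenization and exact string matching,
    used here only as a stand-in for a learned word aligner.
    """
    src = source_caption.lower().split()
    tgt = instruction.lower().split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)

    result = {"preserve": [], "replace": [], "delete": [], "insert": []}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Words shared by both texts: content that should stay untouched.
            result["preserve"].extend(src[i1:i2])
        elif tag == "replace":
            # Source words swapped for new words: content to be altered.
            result["replace"].append((src[i1:i2], tgt[j1:j2]))
        elif tag == "delete":
            # Words only in the source caption: content to be removed.
            result["delete"].extend(src[i1:i2])
        elif tag == "insert":
            # Words only in the instruction: content to be added.
            result["insert"].extend(tgt[j1:j2])
    return result


edits = align_words("a dog sitting on a bench", "a cat sitting on a bench")
print(edits["replace"])
print(edits["preserve"])
```

In this toy example the alignment marks "dog" as the word to be replaced by "cat", while the remaining words are marked as preserved. In an editing pipeline of the kind the abstract describes, such labels would then have to be grounded in image regions to decide what to mask and what to keep. That grounding step is not shown here.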
Taxonomy
Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Educational Tools and Methods
Methods: Diffusion
