Explicit Image Caption Editing
Zhen Wang, Long Chen, Wenbo Ma, Guangxing Han, Yulei Niu, Jian Shao, and Jun Xiao

TL;DR
This paper introduces Explicit Caption Editing (ECE), a new task in which models generate explicit edit sequences to refine image captions, making the process more explainable, efficient, and human-like. It also presents a new model, TIger, and new benchmarks.
Contribution
It proposes the first explicit caption editing framework with a non-autoregressive transformer model and creates new datasets for ECE research.
Findings
TIger effectively generates caption edits with high accuracy.
Explicit edit sequences improve interpretability and efficiency.
New benchmarks facilitate future ECE research.
Abstract
Given an image and a reference caption, the image caption editing task aims to correct the misalignment errors and generate a refined caption. However, all existing caption editing works are implicit models, i.e., they directly produce the refined captions without explicit connections to the reference captions. In this paper, we introduce a new task: Explicit Caption Editing (ECE). ECE models explicitly generate a sequence of edit operations, and this edit operation sequence can translate the reference caption into a refined one. Compared to implicit editing, ECE has multiple advantages: 1) Explainable: it can trace the whole editing path. 2) Editing Efficient: it only needs to modify a few words. 3) Human-like: it resembles the way that humans perform caption editing, and tries to keep original sentence structures. To solve this new task, we propose the first ECE model: TIger. TIger…
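To make the idea of an edit operation sequence concrete, here is a minimal sketch of how such a sequence could turn a reference caption into a refined one. The operation names (KEEP, DELETE, ADD), the `apply_edits` helper, and the example caption are assumptions made for illustration; the text above does not specify the operation set or how TIger predicts it.

```python
from typing import List, Tuple

# Hypothetical operation names, chosen for illustration only.
KEEP, DELETE, ADD = "KEEP", "DELETE", "ADD"

Edit = Tuple[str, str]  # (operation, word); the word is only used by ADD


def apply_edits(reference: List[str], edits: List[Edit]) -> List[str]:
    """Apply an edit sequence to a tokenized reference caption."""
    refined: List[str] = []
    pos = 0  # index of the next unread reference token
    for op, word in edits:
        if op == KEEP:
            # Copy the current reference word unchanged.
            refined.append(reference[pos])
            pos += 1
        elif op == DELETE:
            # Skip the current reference word.
            pos += 1
        elif op == ADD:
            # Insert a new word without consuming a reference word.
            refined.append(word)
        else:
            raise ValueError(f"unknown operation: {op}")
    if pos != len(reference):
        raise ValueError("edit sequence did not consume the whole reference caption")
    return refined


# Made-up example: the reference caption names the wrong animal.
reference = "a dog sitting on a bench".split()
edits = [(KEEP, ""), (DELETE, ""), (ADD, "cat"),
         (KEEP, ""), (KEEP, ""), (KEEP, ""), (KEEP, "")]
refined = apply_edits(reference, edits)
print(" ".join(refined))
```

In this toy case only one word changes and the rest of the sentence structure is preserved, and the edit list itself records what was changed, which is the sense in which an explicit edit path is traceable.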
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques
