Improving Image Captioning with Control Signal of Sentence Quality
Zhangzi Zhu, Hong Qu

TL;DR
This paper introduces a control signal for sentence quality in image captioning models, enabling them to differentiate caption quality levels and improve accuracy without extra ground truth data.
Contribution
It proposes a novel control signal for sentence quality and a reinforcement training method called Q-SAT, enhancing captioning performance.
Findings
Models controlled by the highest quality level outperform baseline models on accuracy-based evaluation metrics.
The control signal improves model awareness of sentence quality.
Q-SAT enhances training effectiveness without additional ground truth data.
Abstract
In image captioning datasets, each image is aligned with several descriptions. Although the quality of these descriptions varies, existing captioning models treat them equally during training. In this paper, we propose a new control signal of sentence quality, which is taken as an additional input to the captioning model. By integrating the control signal information, captioning models become aware of the quality level of the target sentences and handle them differently. Moreover, we propose a novel reinforcement training method specially designed for the control signal of sentence quality: Quality-oriented Self-Annotated Training (Q-SAT). Extensive experiments on the MSCOCO dataset show that, without extra information from ground truth captions, models controlled by the highest quality level outperform baseline models on accuracy-based evaluation metrics, which…
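As a rough illustration of the idea of a quality control signal, the sketch below buckets the reference captions of one image into discrete quality levels and prepends the level as a control token to each target sentence. Everything specific here is an assumption made for illustration and is not taken from the paper: the per-caption scores are made-up numbers standing in for some quality measure, the rank-based bucketing into three levels is one possible choice, and the names `assign_quality_levels`, `add_control_token`, and the `<quality_k>` token format are hypothetical.

```python
from typing import List, Sequence


def assign_quality_levels(scores: Sequence[float], num_levels: int = 3) -> List[int]:
    """Map per-caption scores to discrete levels by rank.

    Level 0 holds the lowest-scoring captions, num_levels - 1 the highest.
    Assumption: rank-based bucketing; the paper may define levels differently.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    levels = [0] * len(scores)
    for rank, idx in enumerate(order):
        # Integer division spreads ranks evenly over the levels.
        levels[idx] = min(rank * num_levels // len(scores), num_levels - 1)
    return levels


def add_control_token(caption_tokens: List[str], level: int) -> List[str]:
    """Prepend a hypothetical control token so a model can condition on the level."""
    return [f"<quality_{level}>"] + caption_tokens


# Hypothetical captions for one image, with made-up quality scores.
captions = [
    ["a", "dog"],
    ["a", "brown", "dog", "runs", "on", "grass"],
    ["dog", "on", "grass"],
    ["a", "dog", "running", "outside"],
    ["an", "animal"],
    ["a", "dog", "outside"],
]
scores = [0.2, 0.9, 0.5, 0.7, 0.1, 0.4]

levels = assign_quality_levels(scores, num_levels=3)
training_targets = [add_control_token(c, lv) for c, lv in zip(captions, levels)]
```

Under this sketch, training would see every caption together with its level, and inference would always supply the highest level (here `<quality_2>`), which requires no ground truth captions at test time.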
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization
Methods: Attentive Walk-Aggregating Graph Neural Network
