Learning Compact Reward for Image Captioning

Nannan Li; Zhenzhong Chen

arXiv:2003.10925 · cs.CV · March 25, 2020 · 1 cites

TL;DR

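This paper introduces rAIRL, a refined adversarial inverse reinforcement learning method for image captioning. It disentangles the reward for each word in a sentence, refines the loss function to stabilize adversarial training, and adds a conditional term to mitigate mode collapse and increase caption diversity.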

Contribution

The paper proposes a refined adversarial IRL approach that addresses reward ambiguity and mode collapse, enhancing image captioning performance.

Findings

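1. Disentangling the reward for each word addresses the reward ambiguity problem and improves caption quality.
2. A refined loss function stabilizes adversarial training, and a conditional term mitigates mode collapse to yield more diverse descriptions.
3. Outperforms existing methods on MS COCO and Flickr30K datasets.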

Abstract

Adversarial learning has advanced the generation of natural and diverse descriptions in image captioning. However, the reward learned by existing adversarial methods is vague and ill-defined due to the reward ambiguity problem. In this paper, we propose a refined Adversarial Inverse Reinforcement Learning (rAIRL) method that handles the reward ambiguity problem by disentangling the reward for each word in a sentence, and that achieves stable adversarial training by refining the loss function to shift the generator towards Nash equilibrium. In addition, we introduce a conditional term in the loss function to mitigate mode collapse and to increase the diversity of the generated descriptions. Our experiments on MS COCO and Flickr30K show that our method can learn a compact reward for image captioning.
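
To make the idea of a per-word reward concrete, here is a minimal sketch of the standard adversarial IRL (AIRL) discriminator form, in which the reward recovered for a step is log D - log(1 - D) = f - log pi. This is the generic AIRL formulation only; it does not reproduce the paper's refined loss or its conditional term, which the abstract does not specify. The function names and all numeric values are hypothetical, chosen purely for illustration.

```python
import math


def airl_discriminator(f_value: float, log_pi: float) -> float:
    """Standard AIRL discriminator D = exp(f) / (exp(f) + pi).

    Computed as sigmoid(f - log pi) for numerical stability.
    """
    z = f_value - log_pi
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def per_word_rewards(f_values, log_pis):
    """Reward for each word: log D - log(1 - D), which simplifies to f - log pi."""
    return [f - lp for f, lp in zip(f_values, log_pis)]


# Hypothetical scores for a 3-word caption (assumed values, not from the paper):
# f_values are learned reward-function outputs, log_pis are generator log-probabilities.
f_values = [0.5, -1.0, 2.0]
log_pis = [math.log(0.2), math.log(0.5), math.log(0.9)]

rewards = per_word_rewards(f_values, log_pis)
discriminator_outputs = [airl_discriminator(f, lp) for f, lp in zip(f_values, log_pis)]
```

Because each word receives its own value of f - log pi, the reward signal is assigned per word rather than as a single score for the whole sentence, which is the kind of disentangling the abstract describes.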

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning