No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang

TL;DR
This paper introduces an adversarial reward learning framework for visual storytelling that learns from human demonstrations, resulting in more human-like stories despite only slight improvements in automatic metrics.
Contribution
The paper proposes the Adversarial REward Learning (AREL) framework, which learns an implicit reward function from human demonstrations instead of relying on hand-crafted rewards, a known weakness of existing reinforcement learning approaches to visual storytelling.
Findings
AREL generates more human-like stories than state-of-the-art (SOTA) methods.
Automatic evaluation metrics improve only slightly over SOTA.
Human evaluation shows a significant improvement in story quality.
Abstract
Though impressive results have been achieved in visual captioning, the task of generating abstract stories from photo streams is still a little-tapped problem. Different from captions, stories have more expressive language styles and contain many imaginary concepts that do not appear in the images. Thus it poses challenges to behavioral cloning algorithms. Furthermore, due to the limitations of automatic metrics on evaluating story quality, reinforcement learning methods with hand-crafted rewards also face difficulties in gaining an overall performance boost. Therefore, we propose an Adversarial REward Learning (AREL) framework to learn an implicit reward function from human demonstrations, and then optimize policy search with the learned reward function. Though automatic evaluation indicates slight performance boost over state-of-the-art (SOTA) methods in cloning expert behaviors,…
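The alternating scheme the abstract describes (learn a reward from human demonstrations, then optimize the policy against that learned reward) can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: it assumes a toy setting with a handful of candidate "stories" identified by integer ids, a tabular reward, and a softmax policy, and the function name `train_toy_arel` and all hyperparameters are hypothetical. The real system would use neural story generators and reward models over image and text features.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_toy_arel(human_demos, num_stories, steps=2000,
                   reward_lr=0.1, policy_lr=0.5, reward_decay=0.1):
    """Toy adversarial reward learning (illustrative assumption, not the paper's code).

    human_demos: list of story ids chosen by humans (the demonstrations).
    Returns the final policy distribution and the learned reward table.
    """
    # Empirical distribution of the human demonstrations.
    human_freq = [human_demos.count(k) / len(human_demos)
                  for k in range(num_stories)]
    reward = [0.0] * num_stories   # learned reward, one value per story
    logits = [0.0] * num_stories   # policy parameters

    for _ in range(steps):
        policy = softmax(logits)

        # Reward step: raise the reward of stories humans produce more often
        # than the policy, lower it for stories the policy over-produces.
        # The decay term is an added regularizer that damps oscillation.
        reward = [r + reward_lr * (h - p) - reward_decay * r
                  for r, h, p in zip(reward, human_freq, policy)]

        # Policy step: exact policy gradient of the expected learned reward,
        # with the expected reward acting as a baseline.
        baseline = sum(p * r for p, r in zip(policy, reward))
        logits = [t + policy_lr * p * (r - baseline)
                  for t, p, r in zip(logits, policy, reward)]

    return softmax(logits), reward


if __name__ == "__main__":
    # Hypothetical demonstrations: story 0 is preferred, story 2 is rare.
    demos = [0] * 6 + [1] * 3 + [2] * 1
    policy, reward = train_toy_arel(demos, num_stories=3)
    print("policy:", [round(p, 3) for p in policy])
    print("reward:", [round(r, 3) for r in reward])
```

In this toy setting no automatic metric appears anywhere in the loop: the only training signal is the gap between what humans produce and what the policy produces, which is the point of learning the reward rather than hand-crafting it.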
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning
