# Visual Story Post-Editing

**Authors:** Ting-Yao Hsu, Chieh-Yang Huang, Yen-Chia Hsu, Ting-Hao 'Kenneth' Huang

arXiv: 1906.01764 · 2019-06-06

## TL;DR

This paper presents VIST-Edit, a new dataset of human edits on machine-generated visual stories. It shows how these edits can be used to improve the output of visual storytelling models, and it highlights the weak correlation between automatic evaluation scores and human ratings.

## Contribution

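The paper introduces VIST-Edit, the first dataset of human edits on machine-generated visual stories (14,905 edited versions of 2,981 stories), and establishes baselines showing that a relatively small set of human edits can boost the performance of visual storytelling models.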

## Key findings

- Human edits can be used to improve the output of visual storytelling models through post-editing.
- Automatic evaluation scores correlate only weakly with human ratings, which motivates new automatic metrics.
- The baselines benefit from a relatively small set of human edits.

## Abstract

We introduce the first dataset for human edits of machine-generated visual stories and explore how these collected edits may be used for the visual story post-editing task. The dataset, VIST-Edit, includes 14,905 human-edited versions of 2,981 machine-generated visual stories. The stories were generated by two state-of-the-art visual storytelling models, each aligned to 5 human-edited versions. We establish baselines for the task, showing how a relatively small set of human edits can be leveraged to boost the performance of large visual storytelling models. We also discuss the weak correlation between automatic evaluation scores and human ratings, motivating the need for new automatic metrics.
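
The weak correlation mentioned in the abstract can be made concrete with a rank-correlation check. The sketch below is a minimal illustration, not the authors' evaluation code, and the abstract does not say which statistic the paper uses. It implements Spearman's rank correlation in plain Python as one common choice. The function names and the toy scores are assumptions made up for illustration. Only the dataset counts (2,981 stories, 5 edits each, 14,905 edits in total) come from the abstract.

```python
from typing import List, Sequence

# Dataset counts stated in the abstract.
NUM_STORIES = 2981
EDITS_PER_STORY = 5
NUM_EDITS = NUM_STORIES * EDITS_PER_STORY  # 14905, matching the abstract


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values, NOT taken from the paper.
toy_metric_scores = [0.10, 0.20, 0.30, 0.40, 0.50]  # e.g. an automatic metric
toy_human_ratings = [3, 1, 4, 2, 5]                 # e.g. crowd ratings

rho = spearman(toy_metric_scores, toy_human_ratings)
print(f"edits in dataset: {NUM_EDITS}, toy Spearman rho: {rho:.2f}")
```

A rho near 1 would mean the automatic metric ranks stories the same way human raters do. Values closer to 0 indicate the kind of disagreement that motivates new metrics.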

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.01764/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1906.01764/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1906.01764/full.md

---
Source: https://tomesphere.com/paper/1906.01764