Discourse Analysis for Evaluating Coherence in Video Paragraph Captions
Arjun R Akula, Song-Chun Zhu

TL;DR
This paper introduces a discourse-based framework and a new dataset, DisNet, for evaluating the coherence of video paragraph captions, focusing on discourse structure rather than just comparison with human annotations.
Contribution
The paper proposes a novel discourse representation approach and introduces DisNet, a dataset with visual discourse annotations for 3000 videos, improving coherence evaluation in video captioning.
Findings
The proposed framework outperforms baseline methods at coherence evaluation
The DisNet dataset enables better discourse analysis
Discourse-based evaluation correlates more closely with human judgment
Abstract
Video paragraph captioning is the task of automatically generating a coherent paragraph description of the actions in a video. Previous linguistic studies have demonstrated that coherence of a natural language text is reflected by its discourse structure and relations. However, existing video captioning methods evaluate the coherence of generated paragraphs by comparing them merely against human paragraph annotations and fail to reason about the underlying discourse structure. At UCLA, we are currently exploring a novel discourse based framework to evaluate the coherence of video paragraphs. Central to our approach is the discourse representation of videos, which helps in modeling coherence of paragraphs conditioned on coherence of videos. We also introduce DisNet, a novel dataset containing the proposed visual discourse annotations of 3000 videos and their paragraphs. Our experiment…
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Natural Language Processing Techniques
