DCA: Diversified Co-Attention towards Informative Live Video Commenting

Zhihan Zhang; Zhiyi Yin; Shuhuai Ren; Xinhang Li; Shicheng Li

arXiv:1911.02739 · cs.CV · August 11, 2020

Open Access

TL;DR

This paper introduces DCA, a model that leverages diversified co-attention mechanisms to improve real-time automatic live video commenting by effectively integrating video frames and viewer comments.

Contribution

The paper proposes a novel Diversified Co-Attention model with an orthogonalization technique for better information integration in live video commenting.

Findings

01 Outperforms existing methods on the ALVC task

02 Achieves state-of-the-art results

03 Effectively captures diverse video and comment information

Abstract

We focus on the task of Automatic Live Video Commenting (ALVC), which aims to generate real-time video comments with both video frames and other viewers' comments as inputs. A major challenge in this task is how to properly leverage the rich and diverse information carried by video and text. In this paper, we aim to collect diversified information from video and text for informative comment generation. To achieve this, we propose a Diversified Co-Attention (DCA) model for this task. Our model builds bidirectional interactions between video frames and surrounding comments from multiple perspectives via metric learning, to collect a diversified and informative context for comment generation. We also propose an effective parameter orthogonalization technique to avoid excessive overlap of information learned from different perspectives. Results show that our approach outperforms existing…
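
The abstract does not give formulas, so the following is only a minimal sketch of the two ideas it names: co-attention between video frames and comments computed under several learned metrics, and a penalty that discourages those metrics from overlapping. The bilinear affinity, the softmax attention, the pairwise Frobenius-norm penalty, and every function name here (`co_attend`, `orthogonality_penalty`) are assumptions made for illustration, not the authors' implementation.

```python
import math

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) times (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax_rows(a):
    # Numerically stable softmax applied to each row independently.
    out = []
    for row in a:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def co_attend(video, text, metric):
    # Assumed bilinear affinity: affinity[i][j] = video[i]^T * metric * text[j].
    affinity = matmul(matmul(video, metric), transpose(text))
    # Each frame attends over comments; each comment attends over frames.
    video_ctx = matmul(softmax_rows(affinity), text)
    text_ctx = matmul(softmax_rows(transpose(affinity)), video)
    return video_ctx, text_ctx

def orthogonality_penalty(metrics):
    # Assumed penalty: sum of squared Frobenius norms of M_i^T M_j for i < j.
    # It is zero when every pair of metric matrices is mutually orthogonal.
    penalty = 0.0
    for i in range(len(metrics)):
        for j in range(i + 1, len(metrics)):
            prod = matmul(transpose(metrics[i]), metrics[j])
            penalty += sum(x * x for row in prod for x in row)
    return penalty

# Toy inputs: 2 frame vectors and 3 comment vectors, all of dimension 2.
video = [[1.0, 0.0], [0.0, 1.0]]
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
metrics = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]

# One (video context, text context) pair per metric, i.e. per "perspective".
contexts = [co_attend(video, text, m) for m in metrics]
```

In this sketch each metric matrix yields its own pair of attended contexts, and a training loss would add `orthogonality_penalty(metrics)`, scaled by some weight, to the generation loss so that the perspectives are pushed apart.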

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Video Analysis and Summarization