Pairwise Emotional Relationship Recognition in Drama Videos: Dataset and Benchmark

Xun Gao; Yin Zhao; Jie Zhang; Longjun Cai

arXiv:2109.11243 · cs.CV · September 24, 2021

TL;DR

This paper introduces a new task, Pairwise Emotional Relationship Recognition (PERR) in videos, presents a large multi-modal dataset called ERATO, and proposes a baseline model with a specialized attention mechanism to advance multi-modal video understanding.

Contribution

The paper defines a new PERR task, creates ERATO, a large-scale multi-modal dataset, and develops the SMTA baseline model to improve multi-modal fusion in emotional relationship recognition.

Findings

01  ERATO contains 31,182 clips and 203 hours of video data.

02  The SMTA attention mechanism improves multi-modal fusion performance by about 1%.

03  The dataset and model facilitate research in multi-modal emotion and relationship recognition.
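As a quick sanity check on the first finding, the two reported figures imply an average clip length. The short Python sketch below only does that arithmetic; it assumes the 203 hours is a rounded total and that it covers exactly the 31,182 clips, so the result is an estimate, not a number reported by the authors.

```python
# Figures taken from the Findings list above.
NUM_CLIPS = 31_182
TOTAL_HOURS = 203  # assumed to be a rounded total

# Convert hours to seconds, then divide by the number of clips.
avg_seconds = TOTAL_HOURS * 3600 / NUM_CLIPS

print(f"{avg_seconds:.1f} s per clip on average")  # prints "23.4 s per clip on average"
```

An average of roughly 23 seconds per clip says nothing about the spread of clip lengths, which this calculation cannot recover.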

Abstract

Recognizing the emotional state of people is a basic but challenging task in video understanding. In this paper, we propose a new task in this field, named Pairwise Emotional Relationship Recognition (PERR). The task aims to recognize the emotional relationship between two interacting characters in a given video clip. It differs from traditional emotion recognition and social relation recognition tasks. Various kinds of information, including character appearance, behaviors, facial emotions, dialogues, background music, and subtitles, contribute differently to the final result, which makes the task more challenging but also meaningful for developing more advanced multi-modal models. To facilitate the task, we develop a new dataset called Emotional RelAtionship of inTeractiOn (ERATO) based on dramas and movies. ERATO is a large-scale multi-modal dataset for the PERR task, which has 31,182…
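To make the task's input and output concrete, here is a minimal Python sketch of what a single PERR example might look like. Everything in it is an assumption for illustration: the class name `PerrSample`, its field names, and the helper `available_modalities` are hypothetical and are not taken from the ERATO release or the authors' code, and the label set is deliberately left unspecified.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PerrSample:
    """Hypothetical container for one PERR example (not the ERATO schema)."""
    clip_path: str                       # video clip: appearance, behaviors, faces
    character_a: str                     # identifier of the first character
    character_b: str                     # identifier of the second character
    audio_path: Optional[str] = None     # dialogue and background music, if present
    subtitle_text: Optional[str] = None  # subtitles, if present
    label: Optional[str] = None          # emotional relationship (label set not given here)


def available_modalities(sample: PerrSample) -> List[str]:
    """List which modalities a model could draw on for this sample."""
    modalities = ["visual"]  # the clip itself is always present
    if sample.audio_path is not None:
        modalities.append("audio")
    if sample.subtitle_text is not None:
        modalities.append("text")
    return modalities


sample = PerrSample(
    clip_path="clip_0001.mp4",  # made-up file name
    character_a="A",
    character_b="B",
    subtitle_text="hello",
)
print(available_modalities(sample))  # ['visual', 'text']
```

The point of the sketch is that the prediction target is attached to a pair of characters within a clip, not to a single person or to the clip as a whole.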

Code & Models

Repositories

cti-vision/perr (Official)
