Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization
Baohan Xu, Yanwei Fu, Yu-Gang Jiang, Boyang Li, Leonid Sigal

TL;DR
This paper introduces a novel framework for transferring knowledge from images and text to improve video emotion recognition, attribution, and summarization, addressing challenges posed by unstructured user-generated videos.
Contribution
The paper is the first to study transferring knowledge from heterogeneous external sources, namely image and textual data, to support three video emotion tasks: recognition, attribution, and emotion-oriented summarization.
Findings
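A video encoding learned from an auxiliary emotional image dataset improves supervised video emotion recognition.
Knowledge transferred from auxiliary textual corpora enables zero-shot recognition of emotion classes unseen during training.
The same framework supports emotion attribution and emotion-oriented summarization.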
Abstract
Emotion is a key element in user-generated videos. However, it is difficult to understand emotions conveyed in such videos due to the complex and unstructured nature of user-generated content and the sparsity of video frames expressing emotion. In this paper, for the first time, we study the problem of transferring knowledge from heterogeneous external sources, including image and textual data, to facilitate three related tasks in understanding video emotion: emotion recognition, emotion attribution and emotion-oriented summarization. Specifically, our framework (1) learns a video encoding from an auxiliary emotional image dataset in order to improve supervised video emotion recognition, and (2) transfers knowledge from auxiliary textual corpora for zero-shot recognition of emotion classes unseen during training. The proposed technique for knowledge transfer facilitates novel…
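The abstract does not say how the zero-shot step is carried out. As a rough illustration only, the sketch below assumes one common approach: a video has already been mapped into the same vector space as word embeddings of emotion labels, and an unseen class is chosen by cosine similarity. The function names, the class labels, and the 3-dimensional vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_predict(video_embedding, class_embeddings):
    """Return the label whose text embedding is most similar to the video embedding.

    Assumption: both kinds of vectors live in one shared semantic space.
    """
    return max(
        class_embeddings,
        key=lambda label: cosine(video_embedding, class_embeddings[label]),
    )


# Made-up vectors standing in for word embeddings of labels unseen in training.
unseen_classes = {
    "grief": [0.9, 0.1, 0.0],
    "joy": [0.0, 0.2, 0.9],
}

# Hypothetical video embedding, assumed already projected into the same space.
video = [0.8, 0.3, 0.1]

prediction = zero_shot_predict(video, unseen_classes)
print(prediction)
```

No labeled videos of the unseen classes are needed in this setup; only a text-derived vector per class name is required, which is what makes auxiliary textual corpora a plausible source of transferable knowledge.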
