Expert and Crowd-Guided Affect Annotation and Prediction

Ramanathan Subramanian; Yan Yan; Nicu Sebe

arXiv:2112.08432·cs.MM·December 17, 2021

Open Access

TL;DR

This paper introduces a novel expert-guided multi-task learning framework that leverages crowdsourced and expert affect annotations to improve emotion prediction accuracy in videos.

Contribution

The paper proposes the EG-MTL algorithm, which minimizes the loss with respect to both crowd and expert labels within a multi-task learning setup for affective computing.

Findings

01. EG-MTL improves arousal and valence estimation accuracy.

02. EG-MTL enhances binary emotion recognition performance.

03. Experimental results validate the effectiveness of the proposed method.

Abstract

We employ crowdsourcing to acquire time-continuous affective annotations for movie clips, and refine noisy models trained from these crowd annotations incorporating expert information within a Multi-task Learning (MTL) framework. We propose a novel expert-guided MTL (EG-MTL) algorithm, which minimizes the loss with respect to both crowd and expert labels to learn a set of weights corresponding to each movie clip for which crowd annotations are acquired. We employ EG-MTL to solve two problems, namely, P1: where dynamic annotations acquired from both experts and crowdworkers for the Validation set are used to train a regression model with audio-visual clip descriptors as features, and predict dynamic arousal and valence levels on 5-15 second snippets derived from the clips; and P2: where a classification model trained on the…
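The abstract describes learning one weight vector per movie clip by minimizing a loss over both crowd and expert labels. The sketch below illustrates that general idea only. It is not the authors' EG-MTL formulation, which the truncated abstract does not specify. It swaps in a standard mean-regularized multi-task ridge regression, and every name in it (`fit_expert_guided_mtl`, `alpha`, `lam`, `mu`) is a hypothetical choice made for illustration.

```python
import numpy as np


def fit_expert_guided_mtl(X_tasks, y_crowd, y_expert,
                          alpha=1.0, lam=1.0, mu=1e-2, n_iter=50):
    """Illustrative stand-in, NOT the paper's algorithm.

    Each task t is one clip with features X_t and two label vectors.
    Minimizes, by block coordinate descent:
        sum_t ||X_t w_t - y_crowd_t||^2
              + alpha * ||X_t w_t - y_expert_t||^2
              + lam * ||w_t - w_bar||^2 + mu * ||w_t||^2
    where w_bar is a shared vector (the mean of the task weights).
    alpha, lam and mu are assumed hyperparameters.
    """
    n_tasks = len(X_tasks)
    dim = X_tasks[0].shape[1]
    W = np.zeros((n_tasks, dim))
    eye = np.eye(dim)
    for _ in range(n_iter):
        # Optimal shared vector given the current task weights.
        w_bar = W.mean(axis=0)
        # Given w_bar, each task has an independent closed-form ridge solution.
        for t in range(n_tasks):
            X, yc, ye = X_tasks[t], y_crowd[t], y_expert[t]
            A = (1.0 + alpha) * (X.T @ X) + (lam + mu) * eye
            b = X.T @ (yc + alpha * ye) + lam * w_bar
            W[t] = np.linalg.solve(A, b)
    return W
```

Under these assumptions, a larger `alpha` makes the fit trust expert labels more than crowd labels, while `lam` controls how strongly the per-clip weights are pulled toward a common solution. Predictions for a snippet with descriptor `x` from clip `t` would then be `x @ W[t]`.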

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Video Analysis and Summarization