Expert and Crowd-Guided Affect Annotation and Prediction
Ramanathan Subramanian, Yan Yan, Nicu Sebe

TL;DR
This paper introduces a novel expert-guided multi-task learning framework that leverages crowdsourced and expert affect annotations to improve emotion prediction accuracy in videos.
Contribution
It proposes the EG-MTL algorithm that effectively combines crowd and expert labels within a multi-task learning setup for affective computing.
Findings
EG-MTL improves arousal and valence estimation accuracy.
EG-MTL enhances binary emotion recognition performance.
Experimental results validate the effectiveness of the proposed method.
Abstract
We employ crowdsourcing to acquire time-continuous affective annotations for movie clips, and refine noisy models trained from these crowd annotations incorporating expert information within a Multi-task Learning (MTL) framework. We propose a novel expert-guided MTL (EG-MTL) algorithm, which minimizes the loss with respect to both crowd and expert labels to learn a set of weights corresponding to each movie clip for which crowd annotations are acquired. We employ EG-MTL to solve two problems, namely, P1: where dynamic annotations acquired from both experts and crowdworkers for the Validation set are used to train a regression model with audio-visual clip descriptors as features, and predict dynamic arousal and valence levels on 5–15 second snippets derived from the clips; and P2: where a classification model trained on the…
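The idea of minimizing a loss over both crowd and expert labels while learning one weight vector per clip can be illustrated with a small toy example. The sketch below is not the EG-MTL algorithm from the paper: the objective, the mean-coupling regularizer, the parameters `lam` and `mu`, and the helper names `fit_expert_guided_mtl` and `make_toy_data` are all assumptions chosen for illustration. Each task stands in for one clip, and synthetic features replace the audio-visual descriptors.

```python
import numpy as np


def fit_expert_guided_mtl(Xs, y_crowd, y_expert, lam=1.0, mu=1.0, n_iter=20):
    """Block-coordinate descent on a hypothetical crowd + expert MTL objective.

    Assumed objective (illustrative only, not taken from the paper):
        sum_t ||X_t w_t - y_crowd_t||^2
      + lam * sum_t ||X_t w_t - y_expert_t||^2
      + mu  * sum_t ||w_t - w_bar||^2
    where w_bar is a shared weight vector that couples the tasks.
    """
    n_tasks, dim = len(Xs), Xs[0].shape[1]
    W = np.zeros((n_tasks, dim))
    w_bar = np.zeros(dim)
    for _ in range(n_iter):
        # Step 1: with w_bar fixed, each task has a closed-form ridge-like solution.
        for t in range(n_tasks):
            X = Xs[t]
            A = (1.0 + lam) * X.T @ X + mu * np.eye(dim)
            b = X.T @ (y_crowd[t] + lam * y_expert[t]) + mu * w_bar
            W[t] = np.linalg.solve(A, b)
        # Step 2: with W fixed, the optimal shared vector is the task mean.
        w_bar = W.mean(axis=0)
    return W


def make_toy_data(seed=0, n_tasks=4, n=50, dim=3):
    """Synthetic stand-in data: clean 'expert' labels, noisy 'crowd' labels."""
    rng = np.random.default_rng(seed)
    w_shared = rng.normal(size=dim)
    W_true = w_shared + 0.1 * rng.normal(size=(n_tasks, dim))
    Xs = [rng.normal(size=(n, dim)) for _ in range(n_tasks)]
    y_expert = [X @ w for X, w in zip(Xs, W_true)]
    y_crowd = [y + 2.0 * rng.normal(size=n) for y in y_expert]
    return Xs, y_crowd, y_expert, W_true


if __name__ == "__main__":
    Xs, y_crowd, y_expert, W_true = make_toy_data()
    W_guided = fit_expert_guided_mtl(Xs, y_crowd, y_expert, lam=10.0)
    W_crowd_only = fit_expert_guided_mtl(Xs, y_crowd, y_expert, lam=0.0)
    print("error with expert term:", np.linalg.norm(W_guided - W_true))
    print("error, crowd only:     ", np.linalg.norm(W_crowd_only - W_true))
```

Setting `lam=0` drops the expert term and fits on crowd labels alone, which makes it easy to compare the two settings on the same toy data.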
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Video Analysis and Summarization
