Disentanglement for audio-visual emotion recognition using multitask setup

Raghuveer Peri; Srinivas Parthasarathy; Charles Bradshaw; Shiva Sundaram

arXiv:2102.06269·eess.IV·February 15, 2021

TL;DR

This paper proposes a multitask learning framework for audio-visual emotion recognition that disentangles emotion-specific features from person identity information, improving interpretability without sacrificing accuracy.

Contribution

It introduces a novel disentanglement approach within a multitask setup to isolate emotion-related features from identity cues in multimodal data.

Findings

01 Achieved up to 13% disentanglement of features.

02 Maintained state-of-the-art emotion recognition performance.

03 Compared three disentanglement techniques.

Abstract

Deep learning models trained on audio-visual data have been successfully used to achieve state-of-the-art performance for emotion recognition. In particular, models trained with multitask learning have shown additional performance improvements. However, such multitask models entangle information between the tasks, encoding the mutual dependencies present in the label distributions of the real-world data used for training. This work explores the disentanglement of multimodal signal representations for the primary task of emotion recognition and a secondary person identification task. In particular, we developed a multitask framework to extract low-dimensional embeddings that aim to capture emotion-specific information while containing minimal information related to person identity. We evaluate three different techniques for disentanglement and report results of up to 13% disentanglement…
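As a rough illustration of the setup the abstract describes, the sketch below combines a primary emotion loss with a secondary identity loss so that the shared embedding is rewarded for predicting emotion and penalized for revealing identity. This is an assumption for illustration only: the abstract does not say which three disentanglement techniques are evaluated, and the adversarial-style objective, the function names, and the weight `lam` are all hypothetical rather than taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of one example given class logits and the true class index."""
    return -math.log(softmax(logits)[target])


def encoder_objective(emotion_logits, emotion_label,
                      identity_logits, identity_label, lam=0.5):
    """Hypothetical encoder objective: minimize the emotion loss while
    maximizing the identity loss (adversarial-style disentanglement).

    `lam` is an assumed trade-off weight, not a value from the paper.
    Returns (total, emotion_loss, identity_loss).
    """
    l_emo = cross_entropy(emotion_logits, emotion_label)
    l_id = cross_entropy(identity_logits, identity_label)
    return l_emo - lam * l_id, l_emo, l_id


# Example: a confident emotion head and an identity head that is at chance
# over four speakers, which is the outcome a disentangled embedding aims for.
total, l_emo, l_id = encoder_objective(
    [2.0, 0.0, 0.0], 0,          # emotion logits, true emotion class
    [0.0, 0.0, 0.0, 0.0], 1,     # identity logits, true speaker index
    lam=0.5,
)
print(total, l_emo, l_id)
```

In this sketch, an identity head that stays at chance level gives the largest identity loss a uniform prediction can produce, so lowering the total pushes the embedding to keep emotion cues while dropping speaker cues. In a full adversarial scheme the identity head itself would be trained separately to minimize its own loss.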
