Dyadic Speech-based Affect Recognition using DAMI-P2C Parent-child Multimodal Interaction Dataset
Huili Chen, Yue Zhang, Felix Weninger, Rosalind Picard, Cynthia Breazeal, and Hae Won Park

TL;DR
This paper presents an end-to-end deep learning approach with attention mechanisms for speech-based affect recognition in dyadic conversations, utilizing a new annotated parent-child dataset to improve multi-speaker affect sensing.
Contribution
The work introduces a novel weighted-pooling attention method and the DAMI-P2C dataset for affect recognition in dyadic interactions, addressing limitations of manual feature extraction.
Findings
Weighted-pooling attention learns to focus on the regions that carry the target speaker's affective information.
The model accurately predicts valence and arousal in dyadic speech.
The DAMI-P2C dataset provides comprehensive annotations for multi-speaker affect analysis.
Abstract
Automatic speech-based affect recognition of individuals in dyadic conversation is a challenging task, in part because of its heavy reliance on manual pre-processing. Traditional approaches frequently require hand-crafted speech features and segmentation of speaker turns. In this work, we design end-to-end deep learning methods to recognize each person's affective expression in an audio stream with two speakers, automatically discovering features and time regions relevant to the target speaker's affect. We integrate a local attention mechanism into the end-to-end architecture and compare the performance of three attention implementations -- one mean pooling and two weighted pooling methods. Our results show that the proposed weighted-pooling attention solutions are able to learn to focus on the regions containing target speaker's affective information and successfully extract the…
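The contrast the abstract draws between mean pooling and weighted pooling can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the formulation of either weighted-pooling variant, so the code below assumes a generic softmax attention over frame-level features, and the scoring vector `w`, the function names, and the toy inputs are all hypothetical.

```python
import math


def mean_pool(frames):
    """Average frame-level feature vectors over time (every frame weighted equally)."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_pool(frames, w):
    """Attention-weighted average of frames.

    `w` stands in for a learned scoring vector (an assumption, not taken
    from the paper): each frame gets the score dot(frame, w), scores are
    normalized with softmax, and frames are summed with those weights.
    """
    scores = [sum(fi * wi for fi, wi in zip(f, w)) for f in frames]
    alphas = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(a * f[d] for a, f in zip(alphas, frames)) for d in range(dim)]
    return pooled, alphas


# Toy input: three time frames with two features each (made-up values).
frames = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
w = [1.0, 0.0]  # hypothetical "learned" weights that favor a large first feature

uniform = mean_pool(frames)
pooled, alphas = weighted_pool(frames, w)
print("mean pooling:    ", uniform)
print("weighted pooling:", pooled)
print("attention weights:", alphas)
```

With an all-zero `w`, every frame receives the same weight and `weighted_pool` reduces to `mean_pool`. In a trained model the weights would instead be expected to concentrate on frames where the target speaker is expressing affect, which is the behavior the abstract reports.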
