Multimodal Dyadic Impression Recognition via Listener Adaptive Cross-Domain Fusion
Yuanchao Li, Peter Bell, Catherine Lai

TL;DR
This paper introduces a listener adaptive cross-domain fusion approach for dyadic impression recognition, modeling speaker-listener interactions to better predict perceived competence and warmth.
Contribution
It proposes a listener adaptive cross-domain architecture with two parts: a listener adaptation function that models the causality between speaker and listener behaviors, and a cross-domain fusion function that strengthens their connection.
Findings
Achieved concordance correlation coefficients (CCC) of 78.8% for competence and 77.5% for warmth.
Outperformed previous methods on the dyadic IMPRESSION dataset.
Demonstrated potential for generalization to similar dyadic interactions.
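For readers unfamiliar with the metric, the concordance correlation coefficient can be sketched as follows. This is a minimal illustration of the standard definition (Lin's CCC), not the authors' evaluation code; the function name `concordance_cc` and the toy inputs are assumptions made for this example, and the reported percentages presumably correspond to CCC values multiplied by 100.

```python
import numpy as np


def concordance_cc(pred, gold):
    """Concordance correlation coefficient between two 1-D sequences.

    Illustrative helper (hypothetical name), using population statistics:
    CCC = 2*cov(pred, gold) / (var(pred) + var(gold) + (mean_pred - mean_gold)^2)
    """
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    if pred.ndim != 1 or pred.shape != gold.shape or pred.size < 2:
        raise ValueError("expected two 1-D sequences of equal length >= 2")

    mean_p, mean_g = pred.mean(), gold.mean()
    var_p, var_g = pred.var(), gold.var()  # population variance (ddof=0)
    cov = ((pred - mean_p) * (gold - mean_g)).mean()

    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0.0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2.0 * cov / denom


if __name__ == "__main__":
    gold = [1.0, 2.0, 3.0, 4.0]
    print(concordance_cc(gold, gold))                  # perfect agreement
    print(concordance_cc([2.0, 3.0, 4.0, 5.0], gold))  # same trend, shifted by 1
```

Unlike Pearson correlation, CCC penalizes a constant offset or scale mismatch between predictions and annotations, which is why the shifted sequence in the second call scores below 1 even though it is perfectly correlated with the reference.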
Abstract
As a sub-branch of affective computing, impression recognition, e.g., perception of speaker characteristics such as warmth or competence, is potentially a critical part of both human-human conversations and spoken dialogue systems. Most research has studied impressions only from the behaviors expressed by the speaker or the response from the listener, yet ignored their latent connection. In this paper, we perform impression recognition using a proposed listener adaptive cross-domain architecture, which consists of a listener adaptation function to model the causality between speaker and listener behaviors and a cross-domain fusion function to strengthen their connection. The experimental evaluation on the dyadic IMPRESSION dataset verified the efficacy of our method, producing concordance correlation coefficients of 78.8% and 77.5% in the competence and warmth dimensions, outperforming…
Taxonomy
Topics: Speech and dialogue systems · Speech and Audio Processing · Emotion and Mood Recognition
