A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals
Yuanchao Li, Catherine Lai

TL;DR
This paper introduces a cross-domain architecture for continuous impression recognition from dyadic audio-visual-physio signals. It uses cross-domain attention and regularization to model both the emitter's signals and the receiver's response.
Contribution
The paper proposes a cross-domain attention and regularization framework for impression recognition from dyadic signals, addressing the fact that most prior work ignores the receiver's response.
Findings
Achieved a concordance correlation coefficient (CCC) of 0.770 on the competence dimension.
Achieved a concordance correlation coefficient (CCC) of 0.748 on the warmth dimension.
Experimental evaluation supported the effectiveness of the approach.
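For readers unfamiliar with the metric, the concordance correlation coefficient reported above measures agreement between a predicted and an annotated continuous trace. It penalizes both poor correlation and offsets in mean or scale, and ranges from -1 to 1. The sketch below uses the standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The function name `concordance_cc` and the toy inputs are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def concordance_cc(y_true, y_pred):
    """Concordance correlation coefficient between two 1-D sequences.

    Uses population (biased) variance and covariance, following the
    standard definition. The name and signature are hypothetical.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mean_t) * (y_pred - mean_p)).mean()
    # The squared mean difference penalizes a constant offset that
    # Pearson correlation alone would ignore.
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

# Toy traces: identical, shifted by a constant, and reversed.
same = concordance_cc([1, 2, 3], [1, 2, 3])
shifted = concordance_cc([1, 2, 3], [2, 3, 4])
reversed_ = concordance_cc([1, 2, 3], [3, 2, 1])
```

The shifted trace is perfectly correlated with the reference in the Pearson sense, yet its concordance drops to 4/7 because of the offset term in the denominator.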
Abstract
The impression we make on others depends not only on what we say, but also, to a large extent, on how we say it. As a sub-branch of affective computing and social signal processing, impression recognition has proven critical in both human-human conversations and spoken dialogue systems. However, most research has studied impressions only from the signals expressed by the emitter, ignoring the response from the receiver. In this paper, we perform impression recognition using a proposed cross-domain architecture on the dyadic IMPRESSION dataset. This improved architecture makes use of cross-domain attention and regularization. The cross-domain attention consists of intra- and inter-attention mechanisms, which capture intra- and inter-domain relatedness, respectively. The cross-domain regularization includes knowledge distillation and similarity enhancement losses, which strengthen the…
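The distinction between intra- and inter-attention in the abstract can be illustrated with generic scaled dot-product attention. This is only a minimal sketch under stated assumptions: it is single-head with no learned projections, and the arrays `domain_a` and `domain_b`, their shapes, and the helper names are hypothetical. The abstract does not specify the paper's actual formulation, so this should not be read as the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(query, key):
    # Scaled dot-product scores, normalized over the key positions.
    d = query.shape[-1]
    return softmax(query @ key.T / np.sqrt(d), axis=-1)

def attend(query, key, value):
    # Each query step becomes a weighted average of the value steps.
    return attention_weights(query, key) @ value

rng = np.random.default_rng(0)
# Hypothetical feature sequences from two domains: (time steps, feature dim).
domain_a = rng.standard_normal((5, 8))
domain_b = rng.standard_normal((7, 8))

# Intra-attention: a domain attends to itself.
intra = attend(domain_a, domain_a, domain_a)
# Inter-attention: queries from one domain, keys and values from the other.
inter = attend(domain_a, domain_b, domain_b)
```

Both outputs keep the query sequence's length, so in a design like this they could be combined step by step; how the paper fuses them is not stated in the abstract.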
Taxonomy
Topics: Emotion and Mood Recognition · Speech and Audio Processing · Music and Audio Processing
Methods: Knowledge Distillation
