A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals

Yuanchao Li; Catherine Lai

arXiv:2203.13932·cs.MM·March 29, 2022·1 cites

TL;DR

This paper introduces a cross-domain architecture for continuous impression recognition from dyadic audio-visual-physio signals. It combines cross-domain attention with cross-domain regularization so that the receiver's response is modeled alongside the signals expressed by the emitter.

Contribution

It proposes a cross-domain attention and regularization framework for impression recognition from dyadic signals, addressing the fact that most prior work has ignored the receiver's response.

Findings

01

Achieved a concordance correlation coefficient of 0.770 for competence.

02

Achieved a concordance correlation coefficient of 0.748 for warmth.

03

Validated the approach experimentally on the dyadic IMPRESSION dataset.
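
For readers unfamiliar with the metric reported above, the following is a minimal sketch of Lin's concordance correlation coefficient, assuming population (1/N) statistics; some implementations use 1/(N-1) instead, and the listing does not say which the authors used. The function name `concordance_cc` and the toy sequences are illustrative assumptions, not taken from the paper.

```python
from statistics import fmean


def concordance_cc(pred, gold):
    """Lin's concordance correlation coefficient for two equal-length sequences."""
    if len(pred) != len(gold) or len(pred) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p, mean_g = fmean(pred), fmean(gold)
    # Population (1/N) variance and covariance; an assumption, see lead-in.
    var_p = fmean([(p - mean_p) ** 2 for p in pred])
    var_g = fmean([(g - mean_g) ** 2 for g in gold])
    cov = fmean([(p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)])
    # The squared mean difference penalizes a systematic offset.
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denom


# Toy data (hypothetical): predictions that track the labels with a constant offset.
gold = [1.0, 2.0, 3.0]
pred = [2.0, 3.0, 4.0]
print(round(concordance_cc(pred, gold), 3))  # 0.571
```

Unlike Pearson correlation, which would be 1 for the toy data, the coefficient drops when predictions are shifted or rescaled relative to the labels, which is why it is a common choice for continuous annotation traces.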

Abstract

The impression we make on others depends not only on what we say, but also, to a large extent, on how we say it. As a sub-branch of affective computing and social signal processing, impression recognition has proven critical in both human-human conversations and spoken dialogue systems. However, most research has studied impressions only from the signals expressed by the emitter, ignoring the response from the receiver. In this paper, we perform impression recognition using a proposed cross-domain architecture on the dyadic IMPRESSION dataset. This improved architecture makes use of cross-domain attention and regularization. The cross-domain attention consists of intra- and inter-attention mechanisms, which capture intra- and inter-domain relatedness, respectively. The cross-domain regularization includes knowledge distillation and similarity enhancement losses, which strengthen the…
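
The distinction between intra- and inter-attention described in the abstract can be illustrated with plain scaled dot-product attention. This is a rough sketch under stated assumptions, not the authors' architecture: it has no learned projections or multiple heads, and the names `attend`, `intra_attention`, `inter_attention`, `domain_a`, and `domain_b` are hypothetical. Which signals make up each domain is left open here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    output = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(width)]
        )
    return output


def intra_attention(domain):
    """Each time step attends to the other steps of the same domain."""
    return attend(domain, domain, domain)


def inter_attention(domain, other):
    """Each time step of one domain attends to the steps of another domain."""
    return attend(domain, other, other)


# Hypothetical feature sequences (time steps x feature dimension).
domain_a = [[1.0, 0.0], [0.0, 1.0]]
domain_b = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
within = intra_attention(domain_a)
across = inter_attention(domain_a, domain_b)
```

In this sketch the only difference between the two mechanisms is where the keys and values come from: the same sequence for intra-attention, a different sequence for inter-attention. The output always has one vector per query time step.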

Taxonomy

Topics: Emotion and Mood Recognition · Speech and Audio Processing · Music and Audio Processing

Methods: Knowledge Distillation