CMIS-Net: A Cascaded Multi-Scale Individual Standardization Network for Backchannel Agreement Estimation
Yuxuan Huang, Kangzhong Wang, Eugene Yujun Fu, Grace Ngai, Peter H.F. Ng

TL;DR
This paper introduces CMIS-Net, a multi-scale neural network that normalizes individual differences in backchannel behaviors and improves agreement detection in conversations, advancing human-like AI responsiveness.
Contribution
The paper presents a novel multi-scale normalization approach and an implicit data augmentation module for better backchannel agreement estimation.
Findings
Achieves state-of-the-art performance in backchannel detection.
Effectively handles individual differences and data imbalance.
Demonstrates robustness across diverse conversational contexts.
Abstract
Backchannels are subtle listener responses, such as nods, smiles, or short verbal cues like "yes" or "uh-huh," which convey understanding and agreement in conversations. These signals provide feedback to speakers, improve the smoothness of interaction, and play a crucial role in developing human-like, responsive AI systems. However, the expression of backchannel behaviors is often significantly influenced by individual differences, operating across multiple scales: from instant dynamics such as response intensity (frame-level) to temporal patterns such as frequency and rhythm preferences (sequence-level). This presents a complex pattern recognition problem that contemporary emotion recognition methods have yet to fully address. Particularly, existing individualized methods in emotion recognition often operate at a single scale, overlooking the complementary nature of multi-scale…
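The abstract distinguishes frame-level differences (response intensity) from sequence-level differences (frequency and rhythm). As a rough illustration only, the sketch below applies plain per-individual z-scoring at both scales with NumPy. It is not the CMIS-Net architecture, which this page does not describe. The function names, the choice of mean and standard deviation as sequence descriptors, and the input layout are all assumptions made for the example.

```python
import numpy as np


def standardize_per_individual(features, ids, eps=1e-8):
    """Z-score each row using the statistics of the individual it belongs to.

    Hypothetical helper: a plain per-speaker z-score, not the paper's module.
    """
    features = np.asarray(features, dtype=float)
    ids = np.asarray(ids)
    out = np.empty_like(features)
    for sid in np.unique(ids):
        mask = ids == sid
        mean = features[mask].mean(axis=0)
        std = features[mask].std(axis=0)
        out[mask] = (features[mask] - mean) / (std + eps)
    return out


def two_scale_standardize(sequences, speaker_ids):
    """Standardize at two scales (assumed layout).

    sequences:   list of (T_i, D) arrays, one per listener clip
    speaker_ids: one identifier per sequence
    Returns (frame_level, seq_level):
      frame_level: list of (T_i, D) arrays, z-scored over all frames of a speaker
      seq_level:   (N, 2*D) array of per-sequence [mean, std] descriptors,
                   z-scored over all sequences of a speaker
    """
    sequences = [np.asarray(s, dtype=float) for s in sequences]
    lengths = [len(s) for s in sequences]

    # Frame level: pool every frame of a speaker, then remove that speaker's
    # baseline intensity and spread.
    frame_ids = np.repeat(np.asarray(speaker_ids), lengths)
    pooled = standardize_per_individual(np.concatenate(sequences), frame_ids)
    frame_level = np.split(pooled, np.cumsum(lengths)[:-1])

    # Sequence level: summarize each clip over time, then standardize the
    # summaries within each speaker.
    descriptors = np.stack(
        [np.concatenate([s.mean(axis=0), s.std(axis=0)]) for s in sequences]
    )
    seq_level = standardize_per_individual(descriptors, speaker_ids)
    return frame_level, seq_level
```

With this kind of standardization, an expressive listener and a reserved listener are mapped onto a comparable range before any classifier sees them. A learned model would likely replace the fixed statistics with trainable components.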
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Social Robot Interaction and HRI
