Decoupling entrainment from consistency using deep neural networks
Andreas Weise, Rivka Levitan

TL;DR
This paper uses neural deconfounding methods to measure conversational entrainment while controlling for individual speaker consistency, and finds that the resulting measures relate to social variables in the opposite direction from earlier measures.
Contribution
It applies two existing neural deconfounding approaches to define new entrainment measures that control for speaker consistency, treating speakers' initial vocal features as confounds when predicting their subsequent outputs.
Findings
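The stricter, consistency-controlled measures correlate with social variables in the opposite direction from previous measures that do not account for consistency.
The new measures successfully discriminate real interactions from fake ones.
The results raise questions about how to interpret previously reported associations between entrainment and conversation quality.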
Abstract
Human interlocutors tend to engage in adaptive behavior known as entrainment to become more similar to each other. Isolating the effect of consistency, i.e., speakers adhering to their individual styles, is a critical part of the analysis of entrainment. We propose to treat speakers' initial vocal features as confounds for the prediction of subsequent outputs. Using two existing neural approaches to deconfounding, we define new measures of entrainment that control for consistency. These successfully discriminate real interactions from fake ones. Interestingly, our stricter methods correlate with social variables in the opposite direction from previous measures that do not account for consistency. These results demonstrate the advantages of using neural networks to model entrainment, and raise questions regarding how to interpret prior associations of conversation quality with entrainment…
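
The idea of treating a speaker's initial features as a confound can be illustrated with a small sketch. This is not the paper's method: it swaps the neural deconfounding approaches for plain linear residualization, and it runs on synthetic data. The function names, the single scalar "vocal feature", and the simulation parameters (`consistency`, `entrainment`) are all assumptions made for illustration. The sketch removes the part of a speaker's response that is predictable from their own initial feature, then checks how strongly the remainder tracks the partner, for real pairings and for shuffled (fake) pairings.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def residualize(target, confound):
    """Remove the part of `target` that is linearly predictable from `confound`."""
    mc, mt = statistics.fmean(confound), statistics.fmean(target)
    slope = (sum((c - mc) * (t - mt) for c, t in zip(confound, target))
             / sum((c - mc) ** 2 for c in confound))
    return [t - (mt + slope * (c - mc)) for c, t in zip(confound, target)]


def entrainment_score(partner, own_initial, response):
    """Correlate the partner's feature with the response after the
    responder's own initial feature (consistency) has been regressed out."""
    return pearson(partner, residualize(response, own_initial))


def simulate(n=2000, consistency=0.8, entrainment=0.3, seed=0):
    """Synthetic exchanges (hypothetical): each response mixes the speaker's
    own initial feature, the partner's feature, and Gaussian noise."""
    rng = random.Random(seed)
    own_initial = [rng.gauss(0, 1) for _ in range(n)]
    partner = [rng.gauss(0, 1) for _ in range(n)]
    response = [consistency * o + entrainment * p + rng.gauss(0, 0.5)
                for o, p in zip(own_initial, partner)]
    return partner, own_initial, response


partner, own_initial, response = simulate()
real_score = entrainment_score(partner, own_initial, response)

# Fake interactions: pair each response with a partner from a different exchange.
fake_partner = partner[:]
random.Random(1).shuffle(fake_partner)
fake_score = entrainment_score(fake_partner, own_initial, response)

print(f"real pairings: {real_score:.3f}")
print(f"fake pairings: {fake_score:.3f}")
```

In this toy setup the score should be clearly positive for real pairings and close to zero for fake ones, because shuffling breaks the link to the partner while leaving the speaker's own consistency intact. A linear fit is only a stand-in here; the abstract's point is that neural models are used for this step.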
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Speech Recognition and Synthesis
