Aligning Backchannel and Dialogue Context Representations via Contrastive LLM Fine-Tuning

Livia Qian; Gabriel Skantze

arXiv:2604.16622·cs.CL·April 21, 2026

TL;DR

This paper introduces a two-stage contrastive fine-tuning approach that aligns dialogue context representations with backchannel realizations, improving context-backchannel retrieval and agreement with human perceptual judgments.

Contribution

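The paper proposes a contrastive learning framework that embeds dialogue contexts and backchannel realizations in a shared space. It targets the relationship between lexico-prosodic form and meaning, which prior work, focused largely on backchannel timing, has left underexplored.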

Findings

01. Learned embeddings align more closely with human judgments than raw features.

02. The proposed method substantially improves context-backchannel retrieval compared to previous methods.

03. Backchannel form is highly sensitive to extended conversational context.
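
To make the retrieval finding concrete, the following is a minimal sketch of how context-backchannel retrieval could be scored with recall@k. The listing does not state which retrieval metric the paper reports, so recall@k, cosine similarity, and the function name `recall_at_k` are all assumptions made for illustration. Row i of each array is assumed to hold a matched context/backchannel pair.

```python
import numpy as np

def recall_at_k(context_emb, backchannel_emb, k=1):
    """Fraction of contexts whose matched backchannel ranks in the top k.

    Hypothetical helper: assumes row i of both arrays is a true pair
    and that similarity is measured by cosine similarity.
    """
    # Normalize rows so the dot product equals cosine similarity.
    c = context_emb / np.linalg.norm(context_emb, axis=1, keepdims=True)
    b = backchannel_emb / np.linalg.norm(backchannel_emb, axis=1, keepdims=True)
    sims = c @ b.T  # sims[i, j] = similarity of context i and backchannel j

    # Rank of the true match = number of candidates scoring strictly higher.
    true_sims = np.diag(sims)[:, None]
    ranks = (sims > true_sims).sum(axis=1)
    return float((ranks < k).mean())
```

With perfectly aligned embeddings every true pair ranks first, so recall@1 is 1.0; misaligned pairs push the true match down the ranking and lower the score.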

Abstract

Backchannels (e.g., 'yeah', 'mhm', and 'right') are short, non-interruptive feedback signals whose lexical form and prosody jointly convey pragmatic meaning. While prior computational research has largely focused on predicting backchannel timing, the relationship between lexico-prosodic form and meaning remains underexplored. We propose a two-stage framework: first, fine-tuning large language models on dialogue transcripts to derive rich contextual representations; and second, learning a joint embedding space for dialogue contexts and backchannel realizations. We evaluate alignment with human perception via triadic similarity judgments (prosodic and cross-lexical) and a context-backchannel suitability task. Our results demonstrate that the learned projections substantially improve context-backchannel retrieval compared to previous methods. In addition, they reveal that backchannel form…
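
The second stage, learning a joint embedding space for contexts and backchannels, can be illustrated with a symmetric InfoNCE objective, a common choice for contrastive alignment of paired embeddings. The abstract does not specify the loss, so this is a sketch under that assumption rather than the authors' method. The function names, the temperature value, and the use of in-batch negatives are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length; eps guards against division by zero.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(context_emb, backchannel_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings.

    Assumption: row i of each array is a positive pair, and all other
    rows in the batch serve as negatives.
    """
    c = l2_normalize(context_emb)
    b = l2_normalize(backchannel_emb)
    logits = c @ b.T / temperature  # (batch, batch) scaled cosine similarities
    idx = np.arange(len(c))

    # Context -> backchannel: each row should peak on its diagonal entry.
    loss_c2b = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Backchannel -> context: each column should peak on its diagonal entry.
    loss_b2c = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_c2b + loss_b2c)
```

Minimizing this loss pulls each context toward its own backchannel and away from the other backchannels in the batch, which is the property a retrieval evaluation would then measure.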
