TL;DR
This study investigates whether large language models adapt their language style to match that of their human users. It finds that models often over-converge relative to the human baseline, and that convergence varies with model size and instruction tuning.
Contribution
It provides a systematic analysis of linguistic convergence in LLMs across sixteen models and three dialogue corpora, highlighting where model convergence differs from human patterns and how model size and instruction tuning affect it.
Findings
Models strongly converge to the conversation's style
Larger and instruction-tuned models converge less than smaller and untuned ones
Models often overfit to stylistic features relative to the human baseline
Abstract
While large language models (LLMs) are generally considered proficient in generating language, how similar their language usage is to that of humans remains understudied. In this paper, we test whether models exhibit linguistic convergence, a core pragmatic element of human language communication: do models adapt, or converge, to the linguistic patterns of their user? To answer this, we systematically compare model completions of existing dialogues to original human responses across sixteen language models, three dialogue corpora, and various stylometric features. We find that models strongly converge to the conversation's style, often significantly overfitting relative to the human baseline. While convergence patterns are often feature-specific, we observe consistent shifts in convergence across modeling settings, with instruction-tuned and larger models converging less than their…
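The comparison described in the abstract (model completion versus original human response, measured on a stylometric feature) can be illustrated with a minimal sketch. This is not the paper's metric or code: the feature (`avg_word_length`), the gap measure (`style_gap`), and the score (`over_convergence`) are hypothetical names and simplifying assumptions chosen only to show the shape of such a comparison.

```python
import re


def avg_word_length(text: str) -> float:
    """Toy stylometric feature (an assumption): mean length of alphabetic tokens."""
    words = re.findall(r"[A-Za-z']+", text)
    return sum(len(w) for w in words) / len(words) if words else 0.0


def style_gap(context: str, response: str, feature=avg_word_length) -> float:
    """Absolute feature difference between the dialogue context and a response.

    A smaller gap means the response sits closer to the context's style.
    """
    return abs(feature(context) - feature(response))


def over_convergence(context: str, human_response: str, model_response: str,
                     feature=avg_word_length) -> float:
    """Human gap minus model gap (hypothetical score).

    Positive values mean the model matched the context's style more closely
    than the original human speaker did.
    """
    return (style_gap(context, human_response, feature)
            - style_gap(context, model_response, feature))


if __name__ == "__main__":
    context = "hi there how are you"
    human = "Magnificent, extraordinarily wonderful"
    model = "ok fine and you"
    print(over_convergence(context, human, model) > 0)
```

Under these assumptions, a positive score for a dialogue would correspond to the "overfitting relative to the human baseline" that the abstract describes. A real analysis would use many features and aggregate over a corpus.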