TL;DR
This paper introduces History-Echoes, a framework for analyzing how conversational history biases an LLM's subsequent responses. It finds that behavioral persistence shows up as a geometric trap in the model's latent space.
Contribution
The work presents a framework that combines a probabilistic view (conversations modeled as Markov chains) with a geometric view (consistency of consecutive hidden representations) to explain how conversational history biases LLMs.
Findings
Strong correlation between the probabilistic and geometric perspectives across three model families and six datasets.
Behavioral persistence manifests as a geometric trap in latent space.
Gaps in latent space confine the model's trajectory.
Abstract
How does the conversational past of large language models (LLMs) influence their future performance? Recent work suggests that LLMs are affected by their conversational history in unexpected ways. For instance, hallucinations in prior interactions may influence subsequent model responses. In this work, we introduce History-Echoes, a framework that investigates how conversational history biases subsequent generations. The framework explores this bias from two perspectives: probabilistically, we model conversations as Markov chains to quantify state consistency; geometrically, we measure the consistency of consecutive hidden representations. Across three model families and six datasets spanning diverse phenomena, our analysis reveals a strong correlation between the two perspectives. By bridging these perspectives, we demonstrate that behavioral persistence manifests as a geometric trap,…
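The two perspectives described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not specify their exact metrics, so the sketch assumes a first-order Markov chain estimated by counting transitions, self-transition probability as the measure of state consistency, and cosine similarity between consecutive hidden vectors as the geometric measure. The function names, the state labels "ok" and "bad", and the toy vectors are all hypothetical.

```python
import math
from collections import Counter


def transition_matrix(sequences, states):
    """Estimate first-order Markov transition probabilities by counting.

    `sequences` is a list of per-conversation state sequences
    (hypothetical labels, one per turn).
    """
    counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[(prev, nxt)] += 1
    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        matrix[s] = {
            t: (counts[(s, t)] / row_total if row_total else 0.0)
            for t in states
        }
    return matrix


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def consecutive_similarity(hidden_states):
    """Cosine similarity between each pair of consecutive turn representations."""
    return [cosine(u, v) for u, v in zip(hidden_states, hidden_states[1:])]


# Toy data (assumed): each turn is labeled "ok" or "bad", e.g. hallucinated.
conversations = [
    ["ok", "ok", "bad", "bad", "bad"],
    ["ok", "bad", "bad", "ok", "ok"],
]
P = transition_matrix(conversations, ["ok", "bad"])

# Probabilistic view: how likely a state is to repeat on the next turn.
persistence = {s: P[s][s] for s in P}

# Geometric view: toy 2-d vectors standing in for per-turn hidden states.
toy_hidden = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
similarities = consecutive_similarity(toy_hidden)
```

In a real analysis the per-turn vectors would come from a model's hidden states rather than toy values, and the two quantities would be compared across many conversations to check whether high self-transition probability coincides with high consecutive similarity.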
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
