Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language Models

Sarath Shekkizhar; Romain Cosentino; Adam Earle

arXiv:2604.02315·cs.AI·April 6, 2026

Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language Models

Sarath Shekkizhar, Romain Cosentino, Adam Earle

PDF

TL;DR

This paper introduces user-turn generation as a new method to probe whether language models encode awareness of ongoing interactions, revealing a dimension of behavior not captured by traditional benchmarks.

Contribution

It proposes a novel probe for interaction awareness in LLMs, demonstrating that models can generate grounded user follow-ups, which is often latent in standard evaluation.

Findings

01

Interaction awareness is decoupled from task accuracy.

02

Higher temperature sampling reveals latent interaction awareness.

03

Post-training improves models' follow-up generation rates.

Abstract

Standard LLM benchmarks evaluate the assistant turn: the model generates a response to an input, a verifier scores correctness, and the analysis ends. This paradigm leaves unmeasured whether the LLM encodes any awareness of what follows the assistant response. We propose user-turn generation as a probe of this gap: given a conversation context of user query and assistant response, we let a model generate under the user role. If the model's weights encode interaction awareness, the generated user turn will be a grounded follow-up that reacts to the preceding context. Through experiments across $11$ open-weight LLMs (Qwen3.5, gpt-oss, GLM) and $5$ datasets (math reasoning, instruction following, conversation), we show that interaction awareness is decoupled from task accuracy. In particular, within the Qwen3.5 family, GSM8K accuracy scales from $41%$ ( $0.8$ B) to $96.8%$ ( $397$ B-A $17$ B),…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.