An Empirical Study of Interaction Smells in Multi-Turn Human-LLM Collaborative Code Generation

Binquan Zhang; Li Zhang; Lin Shi; Song Wang; Yuwei Qian; Linhui Zhao; Fang Liu; An Fu; Yida Ye

arXiv:2603.09701·cs.SE·March 31, 2026

TL;DR

This paper studies Interaction Smells, latent quality issues in the interaction process of multi-turn human-LLM code generation. It categorizes them, analyzes how their distribution differs across models, and proposes the InCE framework to mitigate them.

Contribution

The paper introduces the first taxonomy of Interaction Smells, evaluates their prevalence across models, and proposes a novel multi-agent framework to reduce them.

Findings

01

Interaction Smells are categorized into three main types with nine subcategories.

02

Distribution of Interaction Smells varies significantly among different LLMs.

03

The proposed InCE framework improves task success rate and reduces Interaction Smells.

Abstract

Large Language Models (LLMs) have revolutionized code generation, evolving from static tools into dynamic conversational interfaces that facilitate complex, multi-turn collaborative programming. While LLMs exhibit remarkable proficiency in generating standalone code snippets, they often struggle to maintain contextual consistency during extended interactions, creating significant obstacles in the collaboration process. Existing benchmarks primarily emphasize the functional correctness of the final output, overlooking latent quality issues within the interaction process itself, which we term Interaction Smells. In this paper, we conduct an empirical study on sampled real-world user-LLM interactions from the WildChat and LMSYS-Chat-1M datasets to systematically investigate Interaction Smells in human-LLM code generation tasks from the perspectives of phenomena, distribution, and mitigation.…
