A Scalable Benchmark for Repository-Oriented Long-Horizon Conversational Context Management

Yang Liu; Li Zhang; Fang Liu; Ping Lin; Xinyi Li

arXiv:2603.06358 · cs.SE · March 9, 2026

Open Access

TL;DR

This paper introduces LoCoEval, a benchmark for evaluating long-horizon conversational context management in repository-oriented development. It highlights the difficulties existing methods face in this setting and proposes a unified memory approach to address them.

Contribution

The paper presents the first benchmark dedicated to long-horizon conversational context management in repository-oriented development, evaluates existing context management methods, and proposes a unified memory approach to improve context management.

Findings

01 Existing methods struggle with long-horizon repository conversations.

02 The proposed unified memory approach outperforms baseline methods.

03 LoCoEval provides a realistic evaluation framework for future research.

Abstract

In recent years, large language models (LLMs) have advanced rapidly, substantially enhancing their code understanding and generation capabilities and giving rise to powerful code assistants. However, in practical repository development, excessively long-horizon conversational context may overwhelm models, causing the loss of critical information and degraded performance, thereby limiting the utility of code assistants. Existing context management methods proposed to mitigate this problem primarily target general-purpose conversations, while repository-oriented solutions remain largely unexplored, mainly because reliable evaluation benchmarks are lacking. To bridge this gap, we present LoCoEval, the first long-horizon conversational context management benchmark tailored to repository-oriented development scenarios. Adhering to three key principles, LoCoEval is…

Taxonomy

Topics: Spreadsheets and End-User Computing · AI in Service Interactions · Topic Modeling