ReCUBE: Evaluating Repository-Level Context Utilization in Code Generation

Jiseung Hong; Benjamin G. Ascoli; Jinho D. Choi

arXiv:2603.25770·cs.SE·March 30, 2026

TL;DR

ReCUBE is a benchmark that evaluates how effectively large language models use repository-level context during code generation. It shows that current models struggle with the task and proposes a toolkit to improve agent exploration.

Contribution

The paper introduces ReCUBE, a benchmark for measuring repository-level context utilization, and the Caller-Centric Exploration (CCE) toolkit, which is intended to improve agent exploration in code generation tasks.

Findings

01 State-of-the-art models struggle with repository-level context: GPT-5 achieves a pass rate of only 37.57%.

02 The CCE toolkit improves exploration efficiency, increasing pass rates by up to 7.56%.

03 Repository context utilization remains a significant challenge for current LLMs.

Abstract

Large Language Models (LLMs) have recently emerged as capable coding assistants that operate over large codebases through either agentic exploration or full-context generation. Existing benchmarks capture a broad range of coding capabilities, such as resolving GitHub issues, but none of them directly isolate and measure how effectively LLMs leverage repository-level context during code generation. To address this, we introduce ReCUBE, a benchmark in which LLMs reconstruct a masked file within a real-world repository, using all remaining source files, dependency specifications, and documentation as their only source of context. ReCUBE evaluates reconstructed code with usage-aware test cases that simulate both internal module logic and external cross-file integration, reflecting real-world software usage patterns. We further propose the Caller-Centric Exploration (CCE) toolkit, a set of…
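The masked-file setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: `MaskedFileTask`, `build_task`, `pass_rate`, and the toy repository are hypothetical names introduced here, the repository is modeled as an in-memory mapping from path to contents, and the pass rate is assumed to be the plain percentage of passing test cases. The usage-aware test cases themselves are not reproduced.

```python
from dataclasses import dataclass


@dataclass
class MaskedFileTask:
    """Hypothetical container: one hidden file plus everything else as context."""
    target_path: str
    context: dict[str, str]  # path -> contents, with the target file excluded
    reference: str           # held-out original contents of the target file


def build_task(repo: dict[str, str], target_path: str) -> MaskedFileTask:
    """Hide one file and expose the rest of the repository as context."""
    if target_path not in repo:
        raise KeyError(f"{target_path} is not in the repository")
    context = {path: src for path, src in repo.items() if path != target_path}
    return MaskedFileTask(target_path, context, repo[target_path])


def pass_rate(results: list[bool]) -> float:
    """Assumed metric: percentage of test cases that passed (0.0 if none ran)."""
    return 100.0 * sum(results) / len(results) if results else 0.0


# Toy repository (illustrative only): source files plus documentation.
repo = {
    "pkg/core.py": "def add(a, b):\n    return a + b\n",
    "pkg/cli.py": "from pkg.core import add\nprint(add(1, 2))\n",
    "README.md": "Toy package.\n",
}

task = build_task(repo, "pkg/core.py")
print(sorted(task.context))                  # ['README.md', 'pkg/cli.py']
print(pass_rate([True, False, True, True]))  # 75.0
```

In this sketch the model would see only `task.context` and would have to regenerate `pkg/core.py`. The caller in `pkg/cli.py` is the kind of cross-file evidence that reveals how the hidden file is expected to behave.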
