DeepCode: Open Agentic Coding
Zongwei Li, Zhonghang Li, Zirui Guo, Xubin Ren, Chao Huang

TL;DR
DeepCode is an autonomous framework that manages information flow within a limited LLM context to synthesize code from scientific papers. The authors report that it outperforms existing coding agents and human experts at producing high-quality code from research papers.
Contribution
The paper presents an information-flow management approach for LLM-based code synthesis that works within LLM context limits, and reports state-of-the-art results on scientific paper-to-code benchmarks.
Findings
Outperforms commercial agents like Cursor and Claude Code.
Surpasses PhD-level human experts on key metrics.
Transforms paper specifications into high-quality code autonomously.
Abstract
Recent advances in large language models (LLMs) have given rise to powerful coding agents, making it possible for code assistants to evolve into code engineers. However, existing methods still face significant challenges in achieving high-fidelity document-to-codebase synthesis--such as scientific papers to code--primarily due to a fundamental conflict between information overload and the context bottlenecks of LLMs. In this work, we introduce DeepCode, a fully autonomous framework that fundamentally addresses this challenge through principled information-flow management. By treating repository synthesis as a channel optimization problem, DeepCode seamlessly orchestrates four information operations to maximize task-relevant signals under finite context budgets: source compression via blueprint distillation, structured indexing using stateful code memory, conditional knowledge injection…
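To make the idea of maximizing task-relevant signal under a finite context budget more concrete, here is a minimal Python sketch. It is not DeepCode's implementation: the names `CodeMemory`, `count_tokens`, and `build_context`, the whitespace token count, and the greedy packing order are all assumptions made for illustration. The sketch keeps a short summary per generated file (a stand-in for stateful code memory) and assembles a prompt context from a blueprint excerpt plus only the summaries of files the current task depends on, skipping anything that would exceed the budget.

```python
from dataclasses import dataclass, field


@dataclass
class CodeMemory:
    """Hypothetical stateful index: one short summary per generated file."""
    summaries: dict[str, str] = field(default_factory=dict)

    def record(self, path: str, summary: str) -> None:
        self.summaries[path] = summary

    def lookup(self, paths: list[str]) -> list[str]:
        # Return summaries only for files that are already in memory.
        return [f"{p}: {self.summaries[p]}" for p in paths if p in self.summaries]


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: whitespace-separated words.
    return len(text.split())


def build_context(blueprint_section: str, memory: CodeMemory,
                  dependencies: list[str], budget: int) -> list[str]:
    """Pack the blueprint excerpt first, then dependency summaries,
    skipping any item that would push the total past the budget."""
    context: list[str] = []
    used = 0
    for item in [blueprint_section, *memory.lookup(dependencies)]:
        cost = count_tokens(item)
        if used + cost > budget:
            continue  # item does not fit; try the next one
        context.append(item)
        used += cost
    return context


memory = CodeMemory()
memory.record("model.py", "defines Encoder class with forward(x)")
memory.record("data.py", "loads dataset and returns batches")

context = build_context(
    "Implement train loop using Encoder",
    memory,
    dependencies=["model.py", "missing.py", "data.py"],
    budget=12,
)
print(context)
```

With a budget of 12 whitespace tokens, the blueprint excerpt (5 tokens) and the `model.py` summary (6 tokens) fit, while the `data.py` summary is skipped and `missing.py` is ignored because it has no entry in memory. A real system would use a proper tokenizer and a relevance-based ordering rather than this fixed order.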
Taxonomy
Topics: Machine Learning in Materials Science · Topic Modeling · Machine Learning and Algorithms
