TL;DR
ReCode introduces a unified recursive code generation paradigm that enables large language models to seamlessly operate across different decision granularities, improving adaptability and efficiency in hierarchical decision-making tasks.
Contribution
It proposes a novel unified framework that combines planning and action in a recursive code structure, allowing dynamic control over decision granularity in LLM-based agents.
Findings
Outperforms advanced baselines in inference tasks
Demonstrates high data efficiency in training
Enables flexible hierarchical decision-making
Abstract
Real-world tasks require decisions at varying granularities, and humans excel at this by leveraging a unified cognitive representation where planning is fundamentally understood as a high-level form of action. However, current Large Language Model (LLM)-based agents lack this crucial capability to operate fluidly across decision granularities. This limitation stems from existing paradigms that enforce a rigid separation between high-level planning and low-level action, which impairs dynamic adaptability and limits generalization. We propose ReCode (Recursive Code Generation), a novel paradigm that addresses this limitation by unifying planning and action within a single code representation. In this representation, ReCode treats high-level plans as abstract placeholder functions, which the agent then recursively decomposes into finer-grained sub-functions until reaching primitive…
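The recursive decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the primitive set, the task names, and the `EXPANSIONS` lookup table are all hypothetical, and the table stands in for the LLM call that would generate finer-grained sub-functions for a placeholder. The depth limit is likewise an assumption added to keep the recursion bounded.

```python
from typing import Dict, List, Optional

# Hypothetical primitive actions that the environment can execute directly.
PRIMITIVES = {"goto", "take", "put"}

# Stand-in for the LLM (an assumption for this sketch): maps a placeholder
# function name to the finer-grained calls it expands into. A real agent
# would generate this expansion with a model call instead of a fixed table.
EXPANSIONS: Dict[str, List[str]] = {
    "solve": ["fetch_item", "store_item"],
    "fetch_item": ["goto", "take"],
    "store_item": ["goto", "put"],
}


def expand(name: str) -> List[str]:
    """Return the sub-functions for a placeholder (hypothetical LLM stub)."""
    return EXPANSIONS[name]


def run_recursive(
    name: str,
    trace: Optional[List[str]] = None,
    depth: int = 0,
    max_depth: int = 5,
) -> List[str]:
    """Execute `name`, recursing only when it is not a primitive action."""
    if trace is None:
        trace = []
    if name in PRIMITIVES:
        # Primitive: "execute" it by recording it in the trace.
        trace.append(name)
        return trace
    if depth >= max_depth:
        raise RecursionError(f"placeholder {name!r} exceeded depth {max_depth}")
    # Placeholder: decompose it, then handle each child in order.
    for child in expand(name):
        run_recursive(child, trace, depth + 1, max_depth)
    return trace


if __name__ == "__main__":
    print(run_recursive("solve"))
```

In this sketch the same function handles both a high-level plan (`solve`) and a low-level action (`goto`), so the decision granularity is set by how deep the expansion goes rather than by separate planner and actor modules.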
Peer Reviews
Decision · Submitted to ICLR 2026
ReCode unifies high-level planning and low-level action within a single code-based framework, allowing LLM agents to dynamically adjust decision granularity.
1. The proposed approach is closely related to recent advances in code-as-policies paradigms that leverage code-generation LLMs (e.g., [1–5]). However, the paper does not sufficiently analyze or position ReCode against these methods in the related work section. [1] Code as Policies: Language Model Programs for Embodied Control. ICRA 2023. [2] RoboCodex: Multimodal Code Generation for Robotic Behavior Synthesis. ICML 2024. [3] PoAct: Policy and Action Dual-Control Agent for Generalized Applica
* **Simple, unified mechanism.** Both plans and actions are written as code, and the system recurses only when something isn’t executable, which makes it easy to reason about and implement.
* **Gains at lower cost.** Across benchmarks, the method outperforms baselines at lower cost.
* The novelty over prior recursive code work is unclear; a brief comparison to previous work (e.g., REPL-Plan, Code-as-Policies) would make the contribution explicit.
* No granularity measurement or analysis.
- The paper is well-motivated and easy to read.
- ReCode achieves remarkable cost efficiency and training efficiency. This is a significant practical advantage.
- While the three environments are diverse, they are all text-based simulation environments. The approach needs validation on more complex, real-world tasks or environments with continuous action spaces. According to the paper's content, it seems possible to have primitive actions at a much lower API level (for example, continuous actions like moving forward 3.5 m). Experiments on this aspect are needed.
- The paper doesn't discuss how the system handles errors in code generation or execution f
