Learning and Reusing Policy Decompositions for Hierarchical Generalized Planning with LLM Agents
Shirin Sohrabi, Haritha Ananthakrishnan, Harsha Kokel, Kavitha Srinivas, Michael Katz

TL;DR
This paper introduces HCL-GP, a hierarchical policy-learning method that lets LLM agents learn, decompose, and reuse policies across tasks, improving accuracy and generalization on the AppWorld benchmark.
Contribution
It presents a novel dynamic policy-learning framework that combines hierarchical decomposition, reusable component libraries, and semantic search for LLM agents.
Findings
Achieved 98.2% accuracy on normal tasks and 97.8% on challenge tasks in the AppWorld benchmark.
Dynamic reuse of components raised the success rate to 62.5% for open-source models.
Outperformed static synthesis by 15.8 points on challenging scenarios.
Abstract
We present a dynamic policy-learning approach that combines generalized planning and hierarchical task decomposition for LLM-based agents. Our method, Hierarchical Component Learning for Generalized Policies (HCL-GP), learns parameterized policies that generalize across task instances and automatically extracts reusable components from successful executions, organizing them into a component library for compositional policy generation. We address three challenges: (1) learning components through automated decomposition, (2) generalizing components to maximize reuse, and (3) efficient retrieval via semantic search. Evaluated on the AppWorld benchmark, our approach achieves 98.2% accuracy on normal tasks and 97.8% on challenge tasks with unseen applications, improving 15.8 points over static synthesis on challenging scenarios. For open-source models, dynamic reuse enables 62.5% success…
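To make the idea of a searchable component library concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Component` and `ComponentLibrary` classes, their fields, and the example components are all hypothetical, and a simple bag-of-words cosine similarity stands in for whatever semantic search the paper actually uses.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def _bag_of_words(text):
    # Stand-in for a semantic embedding (assumption): lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Component:
    # Hypothetical record for one reusable, parameterized policy fragment.
    name: str
    description: str
    parameters: tuple


@dataclass
class ComponentLibrary:
    components: list = field(default_factory=list)

    def add(self, component):
        self.components.append(component)

    def search(self, query, top_k=1):
        # Rank stored components by similarity between the query and
        # each component's description; drop components with no overlap.
        q = _bag_of_words(query)
        scored = [(_cosine(q, _bag_of_words(c.description)), c)
                  for c in self.components]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [c for score, c in scored[:top_k] if score > 0]


library = ComponentLibrary()
library.add(Component("login", "log in to an app with username and password",
                      ("app", "username", "password")))
library.add(Component("list_playlists", "list all playlists in the music library",
                      ("access_token",)))

matches = library.search("show the playlists in my music library")
print([c.name for c in matches])
```

In a real system the word-count vectors would be replaced by learned text embeddings, so that a query can match a component whose description shares meaning but not vocabulary.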
