PGPO: Enhancing Agent Reasoning via Pseudocode-style Planning Guided Preference Optimization

Zouying Cao; Runze Wang; Yifei Yang; Xinbei Ma; Xiaoyong Zhu; Bo Zheng; Hai Zhao

arXiv:2506.01475·cs.AI·June 3, 2025

TL;DR

This paper introduces P-code Plans, a pseudocode-style reasoning framework that improves the generalization and efficiency of LLM agents, and proposes PGPO, a planning-guided preference optimization method that enhances agent performance.

Contribution

The paper presents P-code Plans for structured reasoning and PGPO, a novel training method that significantly boosts LLM agent reasoning quality and generalization.

Findings

01 PGPO outperforms current baselines on benchmark tasks.

02 P-code Plans improve reasoning efficiency and reduce errors.

03 PGPO improves the generation of high-quality plans and the accuracy of subsequent decisions.

Abstract

Large Language Model (LLM) agents have demonstrated impressive capabilities in handling complex interactive problems. Existing LLM agents mainly generate natural language (NL) plans to guide reasoning, which are verbose and inefficient. NL plans are also tailored to specific tasks and restrict agents' ability to generalize across similar tasks. To this end, we explore pseudocode-style plans (P-code Plan) to capture the structural logic of reasoning. We find that P-code Plan empowers LLM agents with stronger generalization ability and greater efficiency. Inspired by this finding, we propose a pseudocode-style Planning Guided Preference Optimization method called PGPO for effective agent learning. With two planning-oriented rewards, PGPO further enhances LLM agents' ability to generate high-quality P-code Plans and subsequent reasoning. Experiments show that PGPO achieves superior performance on…
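
To make the contrast between NL plans and pseudocode-style plans concrete, here is a minimal illustrative sketch. It is not the paper's actual P-code Plan format: the `Step` class, the `render_plan` helper, the action names, and the household task are all hypothetical, chosen only to show how a plan written over abstract parameters can be shorter than an NL plan and reusable across similar tasks.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One abstract action in a hypothetical pseudocode-style plan."""
    name: str                                   # action identifier, e.g. "find"
    args: List[str] = field(default_factory=list)
    returns: Optional[str] = None               # variable bound to the result, if any

    def render(self) -> str:
        call = f"{self.name}({', '.join(self.args)})"
        return f"{self.returns} = {call}" if self.returns else call


def render_plan(steps: List[Step]) -> str:
    """Render the plan as one pseudocode line per step."""
    return "\n".join(step.render() for step in steps)


# Hypothetical NL plan: tied to one specific object and written as prose.
nl_plan = (
    "First, I need to look around the room to find the mug. "
    "Once I have found the mug, I should pick it up. "
    "Then I need to go to the sink and clean the mug. "
    "Finally, I should put the clean mug on the shelf."
)

# Hypothetical pseudocode-style plan: same logic over abstract parameters
# (obj, target), so it could be reused for other "clean and place" tasks.
pcode_plan = [
    Step("find", ["obj"], returns="loc"),
    Step("take", ["obj", "loc"]),
    Step("clean", ["obj", "sink"]),
    Step("put", ["obj", "target"]),
]

rendered = render_plan(pcode_plan)
print(rendered)
print(len(nl_plan.split()), "words (NL) vs", len(rendered.split()), "words (pseudocode)")
```

Whitespace-separated word count is only a rough proxy for plan length; a real comparison would count tokens with the agent's own tokenizer.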

Taxonomy

Topics: AI-based Problem Solving and Planning · Semantic Web and Ontologies · Multi-Agent Systems and Negotiation