QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

Zongyu Lin; Yao Tang; Xingcheng Yao; Da Yin; Ziniu Hu; Yizhou Sun; Kai-Wei Chang

arXiv:2502.02584·cs.LG·February 5, 2025

TL;DR

QLASS enhances language agent inference by estimating Q-values stepwise, providing intermediate guidance that improves decision-making and performance, especially with limited annotated data.

Contribution

Introducing QLASS, a method that generates intermediate Q-value annotations for language agents, improving long-term decision making and efficiency with less supervision.

Findings

01. Significant performance gains on complex tasks.

02. Effective with nearly half the annotated data.

03. More accurate decision making demonstrated qualitatively.

Abstract

Language agents have become a promising solution to complex interactive tasks. One of the key ingredients to the success of language agents is the reward model on the trajectory of the agentic workflow, which provides valuable guidance during training or inference. However, due to the lack of annotations of intermediate interactions, most existing works use an outcome reward model to optimize policies across entire trajectories. This may lead to sub-optimal policies and hinder the overall performance. To address this, we propose QLASS (Q-guided Language Agent Stepwise Search), to automatically generate annotations by estimating Q-values in a stepwise manner for open language agents. By introducing a reasoning tree and performing process reward modeling, QLASS provides effective intermediate guidance for each step. With the stepwise guidance, we propose a Q-guided generation strategy to…
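
The stepwise Q-value idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy reasoning tree whose leaves carry an outcome reward, and it applies a standard Bellman backup so that every intermediate step receives a Q-value. The `Node` class, the `backup_q` and `greedy_path` helpers, the discount `gamma`, and the example actions are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One step in a toy reasoning tree (hypothetical structure)."""
    action: str
    reward: float = 0.0  # outcome reward; nonzero only at leaves in this sketch
    children: List["Node"] = field(default_factory=list)
    q: Optional[float] = None  # filled in by backup_q


def backup_q(node: Node, gamma: float = 0.9) -> float:
    """Standard Bellman backup: Q = r + gamma * max over child Q-values."""
    if not node.children:
        node.q = node.reward
    else:
        node.q = node.reward + gamma * max(backup_q(c, gamma) for c in node.children)
    return node.q


def greedy_path(root: Node) -> List[str]:
    """Follow the highest-Q child at each step (Q-guided stepwise selection)."""
    path = []
    node = root
    while node.children:
        node = max(node.children, key=lambda c: c.q)
        path.append(node.action)
    return path


# Toy tree: only the leaves have outcome rewards.
root = Node("start", children=[
    Node("search", children=[Node("click A", reward=1.0), Node("click B", reward=0.0)]),
    Node("back", children=[Node("click C", reward=0.5)]),
])

backup_q(root)
path = greedy_path(root)
print(path)
```

In this sketch the intermediate step "search" ends up with a higher Q-value than "back" because its best descendant has a higher outcome reward, which is the kind of per-step guidance that an outcome-only reward does not provide directly.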

Code & Models

Repositories

rafa-zy/qlass (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems