ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy
Zonghan Yang, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

TL;DR
This paper introduces A$^3$T, a framework for autonomous annotation of language agent trajectories that enhances self-training and performance in multi-step reasoning tasks without extensive human effort.
Contribution
A$^3$T uses an ActRe prompting agent to generate training trajectories autonomously, then improves the language agent through contrastive self-training on those trajectories with minimal human annotation.
Findings
Achieved a 96% 1-shot success rate in AlfWorld with A$^3$T
Matched average human performance in WebShop in the 1-shot setting, with iterative refinement approaching human-expert performance
Outperformed existing prompting and fine-tuning techniques
Abstract
Language agents have demonstrated autonomous decision-making abilities by reasoning with foundation models. Recently, efforts have been made to train language agents for performance improvement, with multi-step reasoning and action trajectories as the training data. However, collecting such trajectories still requires considerable human effort, by either artificial annotation or implementations of diverse prompting frameworks. In this work, we propose A$^3$T, a framework that enables the Autonomous Annotation of Agent Trajectories in the style of ReAct. The central role is an ActRe prompting agent, which explains the reason for an arbitrary action. When randomly sampling an external action, the ReAct-style agent could query the ActRe agent with the action to obtain its textual rationales. Novel trajectories are then synthesized by prepending the posterior reasoning from ActRe to the…
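The annotation step described in the abstract (sample an external action, ask ActRe why that action would be taken, then place the returned rationale before the action) can be sketched as follows. This is a minimal illustration and not the authors' implementation: `Step`, `stub_actre`, `stub_env_step`, `synthesize_step`, and `format_trajectory` are hypothetical names, the ActRe agent is replaced by a fixed-template stub where a real system would prompt a language model, and the environment is a stub that merely echoes the action.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    thought: str      # rationale, placed before the action (ReAct style)
    action: str       # external action sent to the environment
    observation: str  # what the environment returned


def stub_actre(history: List[Step], action: str) -> str:
    """Hypothetical stand-in for the ActRe prompting agent.

    A real ActRe agent would prompt a language model with the history and
    the chosen action; this stub only fills in a fixed template.
    """
    return f"I should {action} because it may help complete the task."


def stub_env_step(action: str) -> str:
    """Hypothetical environment: echoes the action as an observation."""
    return f"You performed: {action}."


def synthesize_step(
    history: List[Step],
    candidate_actions: List[str],
    actre: Callable[[List[Step], str], str],
    env_step: Callable[[str], str],
    rng: random.Random,
) -> Step:
    # 1. Randomly sample an external action.
    action = rng.choice(candidate_actions)
    # 2. Query ActRe for the reason behind that action (posterior reasoning).
    thought = actre(history, action)
    # 3. Execute the action and record the observation.
    observation = env_step(action)
    # 4. Store the rationale ahead of the action, as in a ReAct trajectory.
    return Step(thought=thought, action=action, observation=observation)


def format_trajectory(steps: List[Step]) -> str:
    """Render steps as Thought / Action / Observation lines."""
    lines = []
    for s in steps:
        lines.append(f"Thought: {s.thought}")
        lines.append(f"Action: {s.action}")
        lines.append(f"Observation: {s.observation}")
    return "\n".join(lines)


if __name__ == "__main__":
    rng = random.Random(0)
    history: List[Step] = []
    for _ in range(2):
        history.append(
            synthesize_step(
                history,
                ["go to desk 1", "open drawer 1"],  # made-up actions
                stub_actre,
                stub_env_step,
                rng,
            )
        )
    print(format_trajectory(history))
```

The key design point is ordering: the action is chosen first and the rationale is produced afterwards, yet the stored trajectory lists the rationale before the action so that it reads like an ordinary ReAct step.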
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Reinforcement Learning in Robotics · Semantic Web and Ontologies
Methods: Attention Is All You Need · Layer Normalization · Softmax · Dropout · Byte Pair Encoding · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Linear Layer · Multi-Head Attention
