Adapting LLM Agents with Universal Feedback in Communication

Kuan Wang; Yadong Lu; Michael Santacroce; Yeyun Gong; Chao Zhang; Yelong Shen

arXiv:2310.01444 · cs.CL · April 16, 2024 · 6 cites

Open Access

TL;DR

This paper introduces Learning through Communication (LTC), a universal feedback-based training method for LLM agents that improves their adaptability and performance across diverse tasks and environments.

Contribution

The paper proposes LTC, a universal feedback buffer and iterative pipeline, enabling LLM agents to learn from linguistic and non-linguistic signals in various multi-agent and single-agent settings.

Findings

01. LTC outperforms supervised fine-tuning by 3.6% to 12% on multiple datasets.

02. LTC effectively integrates diverse feedback types for improved agent learning.

03. The approach demonstrates versatility across different task environments.

Abstract

Recent advances in large language models (LLMs) have demonstrated potential for LLM agents. To facilitate training these agents with both linguistic feedback and non-linguistic reward signals, we introduce Learning through Communication (LTC). We design a universal buffer to store all the feedback, and an iterative pipeline that enables an LLM agent to explore and update its policy in a given environment. To optimize agent interactions for task-specific learning with our universal buffer and pipeline, we introduce diverse communication patterns tailored for both single-agent and multi-agent environments. We evaluate the efficacy of our LTC approach on four diverse datasets: ALFWorld (single-agent), HotpotQA (multi-agent collaboration), Chameleon (multi-agent competition), and GSM8k (multi-agent teacher-student). On these datasets, LTC outperforms the supervised instruction…
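
The buffer-and-pipeline idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the record schema, the class and function names (`FeedbackRecord`, `UniversalBuffer`, `run_ltc_style_loop`, `toy_explore`), and the choice to empty the buffer after each update are all assumptions made for illustration. The update step is a stand-in callable, where a real system would run a policy-optimization step on the LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class FeedbackRecord:
    """One step of communication (hypothetical schema, not from the paper)."""
    observation: str
    action: str
    linguistic_feedback: Optional[str] = None  # e.g. a critique from another agent
    reward: Optional[float] = None             # e.g. a task-success signal


@dataclass
class UniversalBuffer:
    """Keeps linguistic and non-linguistic feedback in one place."""
    records: List[FeedbackRecord] = field(default_factory=list)

    def add(self, record: FeedbackRecord) -> None:
        self.records.append(record)

    def drain(self) -> List[FeedbackRecord]:
        # Hand over everything collected so far and start fresh.
        # Emptying the buffer each round is an assumption of this sketch.
        batch, self.records = self.records, []
        return batch


def run_ltc_style_loop(
    explore: Callable[[], List[FeedbackRecord]],
    update: Callable[[List[FeedbackRecord]], None],
    iterations: int,
) -> List[int]:
    """Alternate exploration and policy updates; return the batch size per round."""
    buffer = UniversalBuffer()
    batch_sizes = []
    for _ in range(iterations):
        # Exploration phase: the agent interacts and all feedback is stored.
        for record in explore():
            buffer.add(record)
        # Update phase: the stored feedback is passed to the trainer.
        batch = buffer.drain()
        update(batch)
        batch_sizes.append(len(batch))
    return batch_sizes


def toy_explore() -> List[FeedbackRecord]:
    # Made-up interactions standing in for real environment rollouts.
    return [
        FeedbackRecord("You are in a kitchen.", "open fridge", reward=0.0),
        FeedbackRecord("Q: 12 * 3?", "36", linguistic_feedback="Correct.", reward=1.0),
    ]


seen: List[FeedbackRecord] = []  # stand-in for a trainer: just collects batches
sizes = run_ltc_style_loop(toy_explore, seen.extend, iterations=3)
```

Storing the optional text feedback and the optional scalar reward in the same record is one simple way to let a single update step consume both signal types. How the paper actually encodes its buffer entries is not stated in the abstract.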

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Semantic Web and Ontologies · Business Process Modeling and Analysis

Methods: Entropy Regularization · Proximal Policy Optimization