AgentTuning: Enabling Generalized Agent Abilities for LLMs
Aohan Zeng, Mingdao Liu, Rui Lu, Bowen Wang, Xiao Liu, Yuxiao Dong, Jie Tang

TL;DR
AgentTuning is a novel instruction-tuning approach that enhances large language models' abilities to perform complex agent tasks while preserving their general language capabilities, bridging the gap with commercial models.
Contribution
The paper introduces AgentTuning, a simple, general instruction-tuning method with a new dataset, improving LLMs' agent abilities without sacrificing their overall performance.
Findings
AgentTuning enables LLMs to perform complex agent tasks effectively.
AgentLM-70B matches GPT-3.5-turbo on unseen agent tasks.
Open-sourced models and datasets promote accessible development.
Abstract
Open large language models (LLMs) with great performance in various tasks have significantly advanced the development of LLMs. However, they are far inferior to commercial models such as ChatGPT and GPT-4 when acting as agents to tackle complex tasks in the real world. These agent tasks employ LLMs as the central controller responsible for planning, memorization, and tool utilization, necessitating both fine-grained prompting methods and robust LLMs to achieve satisfactory performance. Though many prompting methods have been proposed to complete particular agent tasks, there is a lack of research focusing on improving the agent capabilities of LLMs themselves without compromising their general abilities. In this work, we present AgentTuning, a simple and general method to enhance the agent abilities of LLMs while maintaining their general LLM capabilities. We construct AgentInstruct, a…
Peer Reviews
Decision · Submitted to ICLR 2024
- The motivation to improve the agent ability of open-sourced LLM is good. - It is well-written and the idea it presents is clear. - The evaluation is extensive and the results look promising.
- Some details of the dataset construction are unclear. - The training strategy used for instruction-tuning is limited. - The rationale behind some design choices needs more explanation.
1. Agent tuning is an exciting and important direction to study for LLMs as intelligent agents. 2. The authors' data, training, and models have been well documented. The results should be reproducible.
1. Figure 1(b): I don't think the message of this figure is fair, since you trained on AgentBench (although partly), but the other LLMs have not been trained on AgentBench. One of the downsides of open-source LLMs is their limited ability to generalize to settings 'different' from training, but the proposed work has essentially made AgentBench in-distribution by training on it. 2. It seems that GPT models are heavily relied on for generating training data. Do we have some sense of how to go beyond GPT models? Su
The paper is written well and easy to follow. It presents a set of expensive experiments, showcasing that open-source LLMs can be competitive with proprietary LLMs when trained on the right data.
While the empirical contribution is significant, the paper overall feels incremental with straightforward improvements over prior instruction tuning and knowledge distillation. Some of the design decisions are also not explained. 1. While the agent trajectories are very valuable and costly to collect, they are mainly extracted from public tasks/benchmarks by using ReAct with GPT models. The overall process with instruction generation, trajectory collection, and filtering can be useful for other
Code & Models
- 🤗 zai-org/agentlm-13b · model · 20 dl · ♡ 20
- 🤗 zai-org/agentlm-70b · model · 115 dl · ♡ 82
- 🤗 zai-org/agentlm-7b · model · 539 dl · ♡ 52
- 🤗 TheBloke/agentlm-70B-AWQ · model · 13 dl · ♡ 2
- 🤗 TheBloke/agentlm-70B-GPTQ · model · 16 dl · ♡ 3
- 🤗 TheBloke/agentlm-70B-GGUF · model · 87 dl · ♡ 7
- 🤗 TheBloke/agentlm-13B-AWQ · model · 4 dl · ♡ 1
- 🤗 TheBloke/agentlm-13B-GGUF · model · 173 dl · ♡ 6
- 🤗 TheBloke/agentlm-13B-GPTQ · model · 14 dl · ♡ 2
- 🤗 TheBloke/agentlm-7B-GGUF · model · 399 dl · ♡ 9
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Linear Layer · Layer Normalization · Attention Dropout · Softmax · Dense Connections
