Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions
Hui Yang, Sifu Yue, Yunzhong He

TL;DR
This paper evaluates Auto-GPT styled agents in real-world decision-making scenarios, compares various LLMs, and introduces the Additional Opinions algorithm to improve performance without fine-tuning.
Contribution
It provides a comprehensive benchmark study of Auto-GPT agents, compares multiple LLMs, and proposes the Additional Opinions algorithm for enhanced decision-making performance.
Findings
Auto-GPT styled agents show varying effectiveness across tasks.
The Additional Opinions algorithm significantly improves decision-making performance.
Benchmark results highlight the adaptability of GPT-based agents in real-world scenarios.
Abstract
Auto-GPT is an autonomous agent that leverages recent advancements in adapting Large Language Models (LLMs) for decision-making tasks. While there has been a growing interest in Auto-GPT styled agents, questions remain regarding the effectiveness and flexibility of Auto-GPT in solving real-world decision-making tasks. Its limited capability for real-world engagement and the absence of benchmarks contribute to these uncertainties. In this paper, we present a comprehensive benchmark study of Auto-GPT styled agents in decision-making tasks that simulate real-world scenarios. Our aim is to gain deeper insights into this problem and understand the adaptability of GPT-based agents. We compare the performance of popular LLMs such as GPT-4, GPT-3.5, Claude, and Vicuna in Auto-GPT styled decision-making tasks. Furthermore, we introduce the Additional Opinions algorithm, an easy and effective…
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection
