Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions

Hui Yang; Sifu Yue; Yunzhong He

arXiv:2306.02224·cs.AI·June 6, 2023·39 cites

TL;DR

This paper evaluates Auto-GPT styled agents in real-world decision-making scenarios, compares various LLMs, and introduces the Additional Opinions algorithm to improve performance without fine-tuning.

Contribution

It provides a comprehensive benchmark study of Auto-GPT agents, compares multiple LLMs, and proposes the Additional Opinions algorithm for enhanced decision-making performance.

Findings

01 Auto-GPT styled agents show varying effectiveness across tasks.

02 The Additional Opinions algorithm significantly improves decision-making performance.

03 Benchmark results highlight the adaptability of GPT-based agents in real-world scenarios.

Abstract

Auto-GPT is an autonomous agent that leverages recent advancements in adapting Large Language Models (LLMs) for decision-making tasks. While there has been growing interest in Auto-GPT styled agents, questions remain regarding the effectiveness and flexibility of Auto-GPT in solving real-world decision-making tasks. Its limited capability for real-world engagement and the absence of benchmarks contribute to these uncertainties. In this paper, we present a comprehensive benchmark study of Auto-GPT styled agents in decision-making tasks that simulate real-world scenarios. Our aim is to gain deeper insights into this problem and understand the adaptability of GPT-based agents. We compare the performance of popular LLMs such as GPT-4, GPT-3.5, Claude, and Vicuna in Auto-GPT styled decision-making tasks. Furthermore, we introduce the Additional Opinions algorithm, an easy and effective…

Code & Models

Repositories

younghuman/llmagent (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection