Language Models can Solve Computer Tasks

Geunwoo Kim; Pierre Baldi; Stephen McAleer

arXiv:2303.17491 · cs.CL · November 20, 2023 · 68 citations

TL;DR

This paper introduces a recursive criticism and improvement (RCI) prompting method for large language models to automate computer tasks using natural language, achieving state-of-the-art results with minimal demonstrations and no task-specific rewards.

Contribution

The paper presents RCI prompting, a novel method enabling LLMs to solve computer tasks efficiently without extensive supervision or reward engineering.

Findings

1. RCI outperforms existing LLM methods on MiniWoB++.
2. RCI with InstructGPT-3+RLHF achieves state-of-the-art results.
3. RCI enhances reasoning abilities beyond chain-of-thought prompting.

Abstract

Agents capable of carrying out general tasks on a computer can improve efficiency and productivity by automating repetitive tasks and assisting in complex problem-solving. Ideally, such agents should be able to solve new computer tasks presented to them through natural language commands. However, previous approaches to this problem require large amounts of expert demonstrations and task-specific reward functions, both of which are impractical for new tasks. In this work, we show that a pre-trained large language model (LLM) agent can execute computer tasks guided by natural language using a simple prompting scheme where the agent Recursively Criticizes and Improves its output (RCI). The RCI approach significantly outperforms existing LLM methods for automating computer tasks and surpasses supervised learning (SL) and reinforcement learning (RL) approaches on the MiniWoB++ benchmark. We…
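The critique-then-improve loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `rci` function, the two prompt strings, the `rounds` parameter, and the `stub_llm` stand-in are all hypothetical names introduced here, and the prompt wording is an assumption rather than the wording used in the paper or in the rci-agent repository.

```python
from typing import Callable

# Hypothetical prompt wording, assumed for illustration only.
CRITIQUE_PROMPT = "Review your previous answer and find problems with it."
IMPROVE_PROMPT = "Based on the problems you found, improve your answer."


def rci(llm: Callable[[str], str], task: str, rounds: int = 1) -> str:
    """Produce an initial answer, then critique and revise it `rounds` times."""
    answer = llm(task)
    for _ in range(rounds):
        # Ask the model to criticize its own current answer.
        critique = llm(f"{task}\n\nAnswer: {answer}\n\n{CRITIQUE_PROMPT}")
        # Ask the model to rewrite the answer in light of that critique.
        answer = llm(
            f"{task}\n\nAnswer: {answer}\n\nCritique: {critique}\n\n{IMPROVE_PROMPT}"
        )
    return answer


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    if IMPROVE_PROMPT in prompt:
        return "click the Submit button"
    if CRITIQUE_PROMPT in prompt:
        return "The answer names the wrong button."
    return "click the Cancel button"


if __name__ == "__main__":
    print(rci(stub_llm, "Submit the form.", rounds=1))
```

In practice `stub_llm` would be replaced by a call to an actual language model. Passing the model as a plain callable keeps the loop independent of any particular API, and `rounds=0` returns the unrevised first answer for comparison.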

Code & Models

Repositories

posgnu/rci-agent · Official

Videos

Language Models can Solve Computer Tasks · SlidesLive

Taxonomy

Topics: Topic Modeling · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications