SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models
Hongxin Li, Jingran Su, Yuntao Chen, Qing Li, Zhaoxiang Zhang

TL;DR
SheetCopilot leverages large language models to automate spreadsheet tasks through a novel agent that interprets natural language commands, significantly improving task completion accuracy over baseline methods.
Contribution
Introduces a SheetCopilot agent with a set of atomic actions and a state machine framework for robust spreadsheet control using LLMs, along with a curated dataset and evaluation pipeline.
Findings
Achieves a 44.3% task completion rate with a single LLM generation.
Outperforms baseline code generation methods.
Provides a dataset of 221 spreadsheet control tasks.
Abstract
Computer end users have spent billions of hours completing daily tasks like tabular data processing and project timeline scheduling. Most of these tasks are repetitive and error-prone, yet most end users lack the skill to automate this burdensome work. With the advent of large language models (LLMs), directing software with natural language user requests becomes a reachable goal. In this work, we propose a SheetCopilot agent that takes a natural language task and controls a spreadsheet to fulfill the requirements. We propose a set of atomic actions as an abstraction of spreadsheet software functionalities. We further design a state machine-based task planning framework for LLMs to robustly interact with spreadsheets. We curate a representative dataset containing 221 spreadsheet control tasks and establish a fully automated evaluation pipeline for rigorously benchmarking the ability of LLMs…
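To make the two ideas in the abstract concrete (atomic actions as an abstraction over spreadsheet functionality, and a state machine that mediates between a planner and the sheet), here is a minimal Python sketch. It is not the authors' implementation: the action names `Write` and `Sum`, the states `PROPOSE`/`VALIDATE`/`EXECUTE`/`DONE`, the dictionary-based sheet, and the scripted planner that stands in for an LLM are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical in-memory sheet: maps cell addresses such as "A1" to numbers.
Sheet = Dict[str, float]
# A planned step is an action name plus its positional arguments.
Step = Tuple[str, tuple]
Planner = Callable[[str, Sheet, str], Optional[Step]]


def write_cell(sheet: Sheet, cell: str, value: float) -> None:
    """Atomic action: store a value in one cell."""
    sheet[cell] = value


def sum_range(sheet: Sheet, cells: List[str], target: str) -> None:
    """Atomic action: write the sum of the given cells into the target cell."""
    sheet[target] = sum(sheet[c] for c in cells)


# Registry of atomic actions (illustrative names, not the paper's action set).
ACTIONS: Dict[str, Callable[..., None]] = {"Write": write_cell, "Sum": sum_range}


def make_scripted_planner(steps: List[Step]) -> Planner:
    """Stand-in for an LLM: replays a fixed list of steps, then signals completion."""
    queue = list(steps)

    def planner(task: str, sheet: Sheet, feedback: str) -> Optional[Step]:
        return queue.pop(0) if queue else None

    return planner


def run(task: str, sheet: Sheet, planner: Planner, max_transitions: int = 30) -> List[str]:
    """Drive the planner through PROPOSE -> VALIDATE -> EXECUTE until it reports DONE."""
    log: List[str] = []
    state, step, feedback = "PROPOSE", None, ""
    for _ in range(max_transitions):
        if state == "PROPOSE":
            step = planner(task, sheet, feedback)
            state = "DONE" if step is None else "VALIDATE"
        elif state == "VALIDATE":
            name, _args = step
            if name in ACTIONS:
                state = "EXECUTE"
            else:
                # Reject unknown actions and feed the error back to the planner.
                feedback = f"unknown action: {name}"
                log.append(feedback)
                state = "PROPOSE"
        elif state == "EXECUTE":
            name, args = step
            try:
                ACTIONS[name](sheet, *args)
                feedback = f"ok: {name}"
            except KeyError as exc:
                feedback = f"error: missing cell {exc}"
            log.append(feedback)
            state = "PROPOSE"
        else:  # DONE
            break
    return log


sheet: Sheet = {"A1": 2.0, "A2": 3.0}
planner = make_scripted_planner([
    ("Total", (["A1", "A2"], "A3")),  # invalid action name, rejected in VALIDATE
    ("Sum", (["A1", "A2"], "A3")),    # valid action, executed
])
log = run("Sum A1 and A2 into A3", sheet, planner)
print(log)
print(sheet["A3"])
```

In this sketch the validation state is what keeps a malformed step from touching the sheet: the invalid `Total` step is logged and returned to the planner as feedback, and only the registered `Sum` action modifies a cell. A real planner would use that feedback string to revise its next proposal.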
Taxonomy
Topics: Spreadsheets and End-User Computing · Topic Modeling · Context-Aware Activity Recognition Systems
