SheetAgent: Towards A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models
Yibin Chen, Yifu Yuan, Zeyu Zhang, Yan Zheng, Jinyi Liu, Fei Ni, Jianye Hao, Hangyu Mao, Fuzheng Zhang

TL;DR
SheetAgent is a novel autonomous system utilizing large language models to perform complex, multi-step spreadsheet reasoning and manipulation, significantly improving accuracy and reasoning capabilities in realistic, long-horizon tasks.
Contribution
The paper introduces SheetAgent, a new LLM-based autonomous agent with a three-module architecture for advanced spreadsheet reasoning and manipulation, and presents a challenging benchmark, SheetRM, for realistic tasks.
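To make the three-module idea concrete, here is a minimal toy sketch of how a Planner, an Informer, and a Retriever might cooperate in a loop. Everything in it is an assumption made for illustration: the role given to each module, the dict-based sheet, the stub functions that stand in for LLM calls, and the names ToyAgent, Sheet, and Action are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Sheet = Dict[str, float]          # toy workbook: cell address -> numeric value
Action = Callable[[Sheet], None]  # an action mutates the sheet in place


def informer(sheet: Sheet, task: str) -> str:
    """Assumed role: summarize the sheet state relevant to the task."""
    return f"task={task!r}; cells={sorted(sheet)}"


def retriever(error: str) -> str:
    """Assumed role: return a hint after a failed action (fixed string here)."""
    return "valid input cells are A1, A2, A3"


def planner(task: str, context: str, hint: str) -> Action:
    """Assumed role: propose the next action. A stub in place of an LLM call."""
    if not hint:
        # The first attempt deliberately reads a cell that does not exist.
        def bad(sheet: Sheet) -> None:
            sheet["A4"] = sheet["A0"] + sheet["A1"]
        return bad

    def good(sheet: Sheet) -> None:
        sheet["A4"] = sheet["A1"] + sheet["A2"] + sheet["A3"]
    return good


@dataclass
class ToyAgent:
    max_steps: int = 3
    log: List[str] = field(default_factory=list)

    def run(self, sheet: Sheet, task: str) -> bool:
        hint = ""
        for _ in range(self.max_steps):
            context = informer(sheet, task)         # observe the sheet
            action = planner(task, context, hint)   # decide what to do
            try:
                action(sheet)                       # act on the sheet
            except KeyError as exc:
                hint = retriever(str(exc))          # recover with a hint
                self.log.append(f"failed: {exc}")
                continue
            self.log.append("ok")
            return True
        return False


sheet = {"A1": 1.0, "A2": 2.0, "A3": 3.0}
agent = ToyAgent()
done = agent.run(sheet, "sum A1:A3 into A4")
```

In this toy run the first planned action fails on a missing cell, the retriever stub supplies a hint, and the second action writes the sum into A4. A real system would replace the stubs with model calls and a spreadsheet library.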
Findings
SheetAgent achieves 20-40% higher pass rates than baselines on multiple benchmarks.
It demonstrates superior reasoning and manipulation accuracy.
The approach reduces human intervention in complex spreadsheet tasks.
Abstract
Spreadsheets are ubiquitous across the World Wide Web and play a critical role in enhancing work efficiency across various domains. Large language models (LLMs) have recently been applied to automatic spreadsheet manipulation, but they have not yet been investigated in complicated and realistic tasks that pose reasoning challenges (e.g., long-horizon manipulation with multi-step reasoning and ambiguous requirements). To bridge the gap with real-world requirements, we introduce SheetRM, a benchmark featuring long-horizon and multi-category tasks with reasoning-dependent manipulation caused by real-life challenges. To mitigate these challenges, we further propose SheetAgent, a novel autonomous agent that utilizes the power of LLMs. SheetAgent consists of three collaborative modules: Planner, Informer, and Retriever, achieving both advanced reasoning and accurate manipulation over…
Taxonomy
Topics: Spreadsheets and End-User Computing
