Spreadsheet-RL: Advancing Large Language Model Agents on Realistic Spreadsheet Tasks via Reinforcement Learning

Banghao Chi; Yining Xie; Mingyuan Wu; Jingcheng Yang; Jize Jiang; Zhaoheng Li; Shengyi Qian; Minjia Zhang; Klara Nahrstedt; Rui Hou; Xiangjun Fan; and Hanchao Yu

arXiv:2605.22642·cs.AI·May 22, 2026

TL;DR

Spreadsheet-RL introduces a reinforcement learning framework to train specialized AI agents for complex, real-world spreadsheet tasks within Excel, significantly improving performance over existing methods.

Contribution

The paper presents a novel RL fine-tuning framework, a scalable data collection pipeline, and a comprehensive environment for training and evaluating spreadsheet agents.

Findings

01

Spreadsheet-RL improves Pass@1 from 12.0% to 23.4% on SpreadsheetBench.

02

It raises Pass@1 from 8.4% to 17.2% on the Domain-Spreadsheet dataset.

03

The framework demonstrates strong potential for real-world spreadsheet automation.
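As a quick reading aid for the two Pass@1 findings above, the sketch below works out the absolute gain (in percentage points) and the relative gain for each benchmark. Only the four percentages come from the page; the names `reported`, `gains`, and `summary` are illustrative and not part of the paper or its code.

```python
# Pass@1 figures (percent) as listed in the findings: (before, after).
reported = {
    "SpreadsheetBench": (12.0, 23.4),
    "Domain-Spreadsheet": (8.4, 17.2),
}


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = after - before
    relative = 100.0 * absolute / before
    return round(absolute, 1), round(relative, 1)


# Hypothetical helper structure: benchmark name -> (absolute, relative) gain.
summary = {name: gains(before, after) for name, (before, after) in reported.items()}

for name, (abs_pp, rel_pct) in summary.items():
    print(f"{name}: +{abs_pp} pp absolute, +{rel_pct}% relative")
```

Read this way, both results amount to roughly a doubling of Pass@1: about 11.4 points on SpreadsheetBench and about 8.8 points on Domain-Spreadsheet.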

Abstract

Spreadsheet systems (e.g., Microsoft Excel, Google Sheets) play a central role in modern data-centric workflows. As AI agents grow increasingly capable of automating complex tasks, such as controlling computers and generating presentations, building an AI-driven spreadsheet agent has emerged as a promising research direction. Most existing spreadsheet agents rely on specialized prompting over general-purpose LLMs; while this design shows promise on simple spreadsheet operations, it struggles with the complex, multi-step workflows typical of real-world applications. We introduce Spreadsheet-RL, a reinforcement learning (RL) fine-tuning framework designed to train specialized spreadsheet agents within a realistic Microsoft Excel environment. Spreadsheet-RL features an automated pipeline for scalable collection of paired start-goal spreadsheets from online forums, as well as…

Code & Models

Repositories

spreadsheet-rl/Spreadsheet-RL
github

Models

🤗
Spreadsheet-RL/Spreadsheet-RL-4B
model · 46 dl · ♡ 2

Datasets

Spreadsheet-RL/Spreadsheet-RL
dataset · 3.6k dl