Talk to Your Slides: High-Efficiency Slide Editing via Language-Driven Structured Data Manipulation

Kyudan Jung; Hojun Cho; Jooyeol Yun; Soyoung Yang; Jaehyeok Jang; Jaegul Choo

arXiv:2505.11604·cs.CL·May 12, 2026

TL;DR

This paper introduces Talk-to-Your-Slides, a language-driven slide editing system that manipulates structured data for efficient, precise, and style-preserving presentation editing, outperforming visual agents in speed and fidelity.

Contribution

The paper presents a hierarchical architecture for language-driven slide editing that manipulates structured data rather than relying on visual perception, and it introduces a new benchmark dataset.

Findings

01 34% faster processing for text-centric tasks

02 34% better instruction fidelity

03 87% lower operational cost compared to GUI baselines

Abstract

Editing presentation slides is a frequent yet tedious task, ranging from creative layout design to repetitive text maintenance. While recent GUI-based agents powered by Multimodal LLMs (MLLMs) excel at tasks requiring visual perception, such as spatial layout adjustments, they often incur high computational costs and latency when handling structured, text-centric, or batch processing tasks. In this paper, we propose Talk-to-Your-Slides, a high-efficiency slide editing agent that operates via language-driven structured data manipulation rather than relying on the image modality. By leveraging the underlying object model instead of screen pixels, our approach ensures precise content modification while preserving style fidelity, addressing the limitations of OCR-based visual agents. Our system features a hierarchical architecture that effectively bridges high-level user instructions with…
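The contrast the abstract draws between editing the object model and editing screen pixels can be illustrated with a small sketch. This is not the authors' implementation: the `Run`, `Shape`, and `Slide` classes and the `replace_text` helper are hypothetical stand-ins for a real presentation object model, chosen only to show how a text edit can leave font, size, and weight untouched because those attributes are never re-derived from an image.

```python
from dataclasses import dataclass, field, replace
from typing import List


# Hypothetical, simplified object model (an assumption for illustration only).
@dataclass(frozen=True)
class Run:
    text: str
    font: str = "Calibri"
    size_pt: int = 18
    bold: bool = False


@dataclass
class Shape:
    name: str
    runs: List[Run] = field(default_factory=list)


@dataclass
class Slide:
    shapes: List[Shape] = field(default_factory=list)


def replace_text(slides: List[Slide], old: str, new: str) -> int:
    """Replace `old` with `new` inside each run and return the number of runs changed.

    Only the `text` field is rewritten; every style attribute of the run is
    carried over unchanged. Matches that span two runs are not handled.
    """
    changed = 0
    for slide in slides:
        for shape in slide.shapes:
            for i, run in enumerate(shape.runs):
                if old in run.text:
                    shape.runs[i] = replace(run, text=run.text.replace(old, new))
                    changed += 1
    return changed


# A two-slide deck with mixed formatting.
deck = [
    Slide([Shape("Title", [Run("Q3 Results", size_pt=40, bold=True)])]),
    Slide([Shape("Body", [
        Run("Revenue in Q3 grew.", font="Arial"),
        Run(" See Q3 appendix.", font="Arial", size_pt=14),
    ])]),
]

# Batch edit across the whole deck in one pass.
changed_runs = replace_text(deck, "Q3", "Q4")
```

A batch edit like this touches every matching run in a single traversal, which is the kind of repetitive, text-centric task the abstract describes as costly for screenshot-driven agents.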

Code & Models

Repositories

KyuDan1/Talk-to-Your-Slides (GitHub)
