SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design
Wenxin Tang, Jingyu Xiao, Wenxuan Jiang, Xi Xiao, Yuhang Wang, Xuxin Tang, Qing Li, Yuehe Ma, Junliang Liu, Shisong Tang, Michael R. Lyu

TL;DR
SlideCoder is a novel layout-aware, retrieval-augmented framework that automates slide generation from reference images, significantly improving layout fidelity and visual consistency over existing methods.
Contribution
We introduce SlideCoder, the first layout-aware, retrieval-augmented approach for slide generation, along with a new benchmark (Slide2Code) and a fine-tuned 7B open-source model (SlideMaster).
Findings
Outperforms baselines by up to 40.5 points in layout fidelity and accuracy.
Demonstrates strong performance in visual consistency and code generation.
Introduces Slide2Code, a new benchmark with difficulty-tiered samples based on a Slide Complexity Metric.
Abstract
Manual slide creation is labor-intensive and requires expert prior knowledge. Existing natural language-based LLM generation methods struggle to capture the visual and structural nuances of slide designs. To address this, we formalize the Reference Image to Slide Generation task and propose Slide2Code, the first benchmark with difficulty-tiered samples based on a novel Slide Complexity Metric. We introduce SlideCoder, a layout-aware, retrieval-augmented framework for generating editable slides from reference images. SlideCoder integrates a Color Gradient-based Segmentation algorithm and a Hierarchical Retrieval-Augmented Generation method to decompose complex tasks and enhance code generation. We also release SlideMaster, a 7B open-source model fine-tuned with improved reverse-engineered data. Experiments show that SlideCoder outperforms state-of-the-art baselines by up to 40.5 points,…
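The abstract does not spell out how the Color Gradient-based Segmentation algorithm works, so the following is only a toy illustration of the general idea of using color gradients to decompose a slide image into smaller blocks. It is not the authors' algorithm. The names `row_gradient` and `split_into_bands`, the grayscale list-of-rows image format, and the `threshold` parameter are all assumptions made for this sketch: rows with almost no horizontal intensity change are treated as flat background, and the runs of rows between them become candidate content bands.

```python
from typing import List, Tuple

# Hypothetical image format for this sketch: grayscale rows, values 0-255.
Image = List[List[int]]


def row_gradient(image: Image) -> List[float]:
    """Mean absolute horizontal intensity change for each row."""
    grads = []
    for row in image:
        diffs = [abs(row[i + 1] - row[i]) for i in range(len(row) - 1)]
        grads.append(sum(diffs) / len(diffs) if diffs else 0.0)
    return grads


def split_into_bands(image: Image, threshold: float = 1.0) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive, of content bands.

    Rows whose gradient is at or below `threshold` are treated as flat
    background that separates one band from the next.
    """
    bands = []
    start = None
    for y, g in enumerate(row_gradient(image)):
        if g > threshold and start is None:
            start = y  # first textured row opens a band
        elif g <= threshold and start is not None:
            bands.append((start, y))  # flat row closes the band
            start = None
    if start is not None:
        bands.append((start, len(image)))  # band runs to the bottom edge
    return bands


# Tiny synthetic "slide": flat rows separate two textured regions.
demo = [
    [255, 255, 255, 255],
    [0, 255, 0, 255],
    [0, 255, 0, 255],
    [255, 255, 255, 255],
    [10, 200, 10, 200],
    [255, 255, 255, 255],
]
bands = split_into_bands(demo)
```

Each band could then be handled as its own, simpler generation subtask, which is the kind of decomposition the abstract attributes to the segmentation step. A real implementation would presumably work on color images and in both directions; this sketch covers only horizontal bands.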
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques
