Learning to Present: Inverse Specification Rewards for Agentic Slide Generation
Karthik Ragunath Ananda Kumar, Subrahmanyam Arunachalam

TL;DR
This paper introduces a reinforcement learning environment for automated slide presentation generation using LLMs, with a novel inverse specification reward that enhances content fidelity and presentation quality.
Contribution
It proposes a multi-component reward system that includes an inverse specification reward, and fine-tunes Qwen2.5-Coder-7B via GRPO while training only 0.5% of its parameters, advancing automated presentation creation.
Findings
The fine-tuned 7B model achieves 91.2% of Claude Opus 4.6's quality.
Instruction adherence and tool-use are key to task performance.
The approach improves over the base model by 33.1%.
Abstract
Automated presentation generation remains a challenging task requiring coherent content creation, visual design, and audience-aware communication. This work proposes an OpenEnv-compatible reinforcement learning environment where LLM agents learn to research topics, plan content, and generate professional HTML slide presentations through tool use. We introduce a multi-component reward system combining structural validation, render quality assessment, LLM-based aesthetic scoring, content quality metrics, and an inverse specification reward that measures how faithfully generated slides convey their intended purpose. The inverse specification reward, an "inverse task" where an LLM attempts to recover the original specification from generated slides, provides a holistic quality signal. Our approach fine-tunes Qwen2.5-Coder-7B via GRPO, training only 0.5% of parameters on prompts derived from…
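The reward design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The component names mirror the abstract, but the equal weights, the `recover_spec` callable (a stand-in for the LLM that reads the slides and guesses the specification), and the token-overlap similarity are all assumptions made for illustration. The last function shows the group-relative standardization commonly used in GRPO, where each rollout's reward is compared with the other rollouts sampled for the same prompt.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List

# Hypothetical equal weights; the paper's actual weighting is not stated here.
WEIGHTS: Dict[str, float] = {
    "structure": 0.2,     # structural validation
    "render": 0.2,        # render quality assessment
    "aesthetic": 0.2,     # LLM-based aesthetic score
    "content": 0.2,       # content quality metrics
    "inverse_spec": 0.2,  # inverse specification reward
}


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (an assumed similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def inverse_spec_reward(
    spec: str,
    slides_html: str,
    recover_spec: Callable[[str], str],
) -> float:
    """Score how well the original spec can be recovered from the slides.

    `recover_spec` is a hypothetical callable standing in for the LLM that
    performs the inverse task.
    """
    return token_jaccard(spec, recover_spec(slides_html))


def total_reward(components: Dict[str, float]) -> float:
    """Weighted sum of per-component scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Standardize rewards within one group of rollouts for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, a rollout whose slides let the recovering model reproduce the specification exactly receives an inverse specification score of 1.0, and a group of rollouts with identical total rewards yields zero advantage for every member, so it contributes no learning signal.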
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games
