TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding
Max Ku, Thomas Chong, Jonathan Leung, Krish Shah, Alvin Yu, Wenhu Chen

TL;DR
This paper introduces TheoremExplainAgent, an agentic system that uses LLMs and Manim animations to generate long-form video explanations of theorems. It evaluates the system on a new benchmark, TheoremExplainBench, with 5 automated metrics, and the results point to the importance of multimodal explanations for theorem understanding.
Contribution
The paper presents TheoremExplainAgent for generating multimodal theorem explanations and TheoremExplainBench for evaluating them systematically, advancing the generation of pedagogically meaningful visual explanations.
Findings
Agentic planning improves long-form video generation.
The o3-mini agent achieves a 93.8% success rate.
Generated videos show minor layout issues, and the video format exposes reasoning flaws in the explanations.
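As a quick sanity check on the reported figure, the sketch below computes a success rate as the fraction of benchmark theorems for which a video was produced. The function name `success_rate` and the count of 225 successful videos are assumptions for illustration only; the summary above does not state how the rate is computed or over how many items. Under the assumption that it is taken over all 240 theorems in TheoremExplainBench, 225 successes would be consistent with the reported 93.8%.

```python
def success_rate(successes: int, total: int) -> float:
    """Return the fraction of benchmark items that produced a video.

    Hypothetical helper: the paper's exact definition of "success rate"
    is not given in this summary.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    return successes / total


# Assumed count: 225 of 240 is chosen only because it matches the
# reported rate after rounding; it is not a number from the paper.
rate = success_rate(225, 240)
print(f"{rate:.1%}")
```

Any count other than 225 out of 240 rounds to a different one-decimal percentage, which is why that count is used here.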
Abstract
Understanding domain-specific theorems often requires more than just text-based reasoning; effective communication through structured visual explanations is crucial for deeper comprehension. While large language models (LLMs) demonstrate strong performance in text-based theorem reasoning, their ability to generate coherent and pedagogically meaningful visual explanations remains an open challenge. In this work, we introduce TheoremExplainAgent, an agentic approach for generating long-form theorem explanation videos (over 5 minutes) using Manim animations. To systematically evaluate multimodal theorem explanations, we propose TheoremExplainBench, a benchmark covering 240 theorems across multiple STEM disciplines, along with 5 automated evaluation metrics. Our results reveal that agentic planning is essential for generating detailed long-form videos, and the o3-mini agent achieves a…
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Software Engineering Research
