Text to Automata Diagrams: Comparing TikZ Code Generation with Direct Image Synthesis
Ethan Young, Zichun Wang, Aiden Taylor, Chance Jewell, Julian Myers, Satya Sri Rajiteswari Nimmagadda, Anthony White, Aniruddha Maiti, Ananya Jana

TL;DR
This paper examines whether vision-language models and large language models can convert scanned, student-drawn diagrams into TikZ code. It compares direct image synthesis with generation from textual descriptions, with the aim of improving automated diagram generation in education.
Contribution
The paper introduces a pipeline that combines a vision-language model and a large language model to generate TikZ diagrams from scanned images, and it highlights how human correction affects description accuracy.
Findings
Descriptions from vision-language models are often inaccurate.
Human revision significantly improves description quality.
Generated diagrams can be evaluated against original images.
Abstract
Diagrams are widely used in teaching computer science courses, particularly in subjects such as automata and formal languages and data structures. These diagrams, often drawn by students during exams or assignments, vary in structure, layout, and correctness. This study examines whether current vision-language and large language models can process such diagrams and produce accurate textual and digital representations. Scanned student-drawn diagrams are used as input, and a vision-language model generates a textual description of each image. Human reviewers then check and revise the descriptions to make them accurate. Both the generated and the revised descriptions are fed to a large language model to generate TikZ code. The resulting diagrams are compiled and evaluated against the original scanned diagrams. We found descriptions…
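To make the target output of the pipeline concrete, the following is a minimal sketch of what TikZ code for a small automaton can look like. It is not the authors' method: the paper uses a large language model to write the TikZ code, whereas this sketch builds it deterministically from a structured description. The `Automaton` class, the `to_tikz` function, and the left-to-right layout are all assumptions made for illustration. The emitted code relies on TikZ's `automata` and `positioning` libraries.

```python
from dataclasses import dataclass


@dataclass
class Automaton:
    """Hypothetical structured description of a finite automaton."""
    states: list[str]
    initial: str
    accepting: set[str]
    transitions: list[tuple[str, str, str]]  # (source, label, target)


def to_tikz(a: Automaton) -> str:
    """Render the automaton as a tikzpicture, placing states left to right.

    The output assumes \\usetikzlibrary{automata, positioning} in the preamble.
    """
    lines = [r"\begin{tikzpicture}[shorten >=1pt, node distance=2cm, on grid, auto]"]
    prev = None
    for s in a.states:
        opts = ["state"]
        if s == a.initial:
            opts.append("initial")
        if s in a.accepting:
            opts.append("accepting")
        if prev is not None:
            # Illustrative layout choice: each state sits to the right of the previous one.
            opts.append(f"right=of {prev}")
        lines.append(rf"  \node[{', '.join(opts)}] ({s}) {{{s}}};")
        prev = s
    lines.append(r"  \path[->]")
    for src, label, dst in a.transitions:
        if src == dst:
            # Self-loops use TikZ's loop edge style with an empty target.
            lines.append(rf"    ({src}) edge[loop above] node {{{label}}} ()")
        else:
            lines.append(rf"    ({src}) edge node {{{label}}} ({dst})")
    lines.append("  ;")
    lines.append(r"\end{tikzpicture}")
    return "\n".join(lines)


# Example: two states, q0 (initial) and q1 (accepting), with one self-loop.
example = Automaton(
    states=["q0", "q1"],
    initial="q0",
    accepting={"q1"},
    transitions=[("q0", "a", "q1"), ("q1", "b", "q1")],
)
print(to_tikz(example))
```

A structured intermediate form like this is one way to compare a generated description with a human-revised one before any TikZ is produced. Whether the paper uses such a form is not stated in the abstract.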
Taxonomy
Topics: Teaching and Learning Programming · Data Visualization and Analytics · Visual and Cognitive Learning Processes
