Text to Automata Diagrams: Comparing TikZ Code Generation with Direct Image Synthesis

Ethan Young; Zichun Wang; Aiden Taylor; Chance Jewell; Julian Myers; Satya Sri Rajiteswari Nimmagadda; Anthony White; Aniruddha Maiti; Ananya Jana

arXiv:2603.07936·cs.CV·March 10, 2026



TL;DR

This paper explores the use of vision-language and large language models to convert scanned student diagrams into TikZ code, comparing direct image synthesis with text-based description methods, aiming to improve automated diagram generation in education.

Contribution

It introduces a pipeline combining vision-language models and large language models to generate TikZ diagrams from scanned images, highlighting the impact of human correction on description accuracy.

Findings

01. Descriptions from vision-language models are often inaccurate.

02. Human revision significantly improves description quality.

03. Generated diagrams can be evaluated against original images.

Abstract

Diagrams are widely used in teaching computer science courses. They are useful in subjects such as automata and formal languages, data structures, etc. These diagrams, often drawn by students during exams or assignments, vary in structure, layout, and correctness. This study examines whether current vision-language and large language models can process such diagrams and produce accurate textual and digital representations. In this study, scanned student-drawn diagrams are used as input. Then, textual descriptions are generated from these images using a vision-language model. The descriptions are checked and revised by human reviewers to make them accurate. Both the generated and the revised descriptions are then fed to a large language model to generate TikZ code. The resulting diagrams are compiled and then evaluated against the original scanned diagrams. We found descriptions…
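To make the target output format concrete, here is a minimal sketch of what TikZ code for an automaton diagram can look like. It is not the authors' method: the paper generates TikZ with a large language model, whereas this example builds it deterministically from a structured description. The `Automaton` class, the `to_tikz` function, and the simple left-to-right layout are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Automaton:
    """Hypothetical structured form of a diagram description (an assumption)."""
    states: list[str]
    start: str
    accepting: set[str]
    # Each transition is (source state, input symbol, target state).
    transitions: list[tuple[str, str, str]] = field(default_factory=list)


def to_tikz(a: Automaton) -> str:
    """Render the automaton as TikZ source, placing states left to right."""
    lines = [
        r"\begin{tikzpicture}[shorten >=1pt, node distance=2cm, on grid, auto]",
    ]
    prev = None
    for s in a.states:
        opts = ["state"]
        if s == a.start:
            opts.append("initial")
        if s in a.accepting:
            opts.append("accepting")
        if prev is not None:
            # Position each state to the right of the previous one.
            opts.append(f"right=of {prev}")
        lines.append(f"  \\node[{', '.join(opts)}] ({s}) {{${s}$}};")
        prev = s
    lines.append(r"  \path[->]")
    for src, sym, dst in a.transitions:
        # Self-transitions are drawn as loops; all other edges are bent.
        edge = "edge[loop above]" if src == dst else "edge[bend left]"
        lines.append(f"    ({src}) {edge} node {{{sym}}} ({dst})")
    lines.append("  ;")
    lines.append(r"\end{tikzpicture}")
    return "\n".join(lines)


if __name__ == "__main__":
    example = Automaton(
        states=["q0", "q1"],
        start="q0",
        accepting={"q1"},
        transitions=[("q0", "a", "q1"), ("q1", "b", "q1")],
    )
    print(to_tikz(example))
```

Compiling the printed output requires `\usepackage{tikz}` and `\usetikzlibrary{automata, positioning}` in the LaTeX preamble. A structured intermediate form like this is one way to compare a generated description against a human-revised one before any TikZ is produced, though the paper itself works with free-text descriptions.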


Taxonomy

Topics: Teaching and Learning Programming · Data Visualization and Analytics · Visual and Cognitive Learning Processes