DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
Jonas Belouadi, Simone Paolo Ponzetto, Steffen Eger

TL;DR
DeTikZify is a multimodal model that automatically converts sketches and figures into high-quality TikZ graphics, significantly easing the creation of scientific figures by leveraging new datasets and an iterative refinement algorithm.
Contribution
We introduce DeTikZify, a novel model, together with new datasets and a refinement method based on Monte Carlo Tree Search (MCTS), for converting sketches and figures into semantics-preserving TikZ graphics programs.
Findings
DeTikZify outperforms Claude 3 and GPT-4V in TikZ synthesis.
The MCTS algorithm improves output quality through iterative refinement.
We publicly release the datasets and code for further research.
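
The summary above says only that refinement is MCTS-based and iterative, so the following is a generic Monte Carlo Tree Search sketch and not the authors' implementation. It treats a graphics program as a sequence of lines, grows a search tree over partial programs, and keeps the best-scoring complete program seen so far. The names `mcts_refine`, `propose`, `score`, `toy_propose`, and `toy_score` are hypothetical: `propose` stands in for sampling a continuation from a model, and `score` stands in for whatever quality measure guides the search.

```python
import math
import random


class Node:
    """One partial program in the search tree."""

    def __init__(self, lines, parent=None):
        self.lines = lines          # partial program as a tuple of lines
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def ucb1(self, c=1.4):
        # Unvisited children are tried first.
        if self.visits == 0:
            return math.inf
        exploit = self.total_reward / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore


def mcts_refine(propose, score, max_lines, iterations=200, width=3, seed=0):
    """Generic MCTS loop: select, expand, roll out, backpropagate."""
    rng = random.Random(seed)
    root = Node(())
    best, best_score = (), -math.inf
    for _ in range(iterations):
        # Selection: follow the highest UCB1 child down to a leaf.
        node = root
        while node.children:
            node = max(node.children, key=lambda n: n.ucb1())
        # Expansion: add candidate continuations unless the program is full.
        if len(node.lines) < max_lines:
            node.children = [
                Node(node.lines + (propose(node.lines, rng),), node)
                for _ in range(width)
            ]
            node = rng.choice(node.children)
        # Rollout: complete the program with further proposals.
        program = node.lines
        while len(program) < max_lines:
            program = program + (propose(program, rng),)
        reward = score(program)
        if reward > best_score:
            best, best_score = program, reward
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best, best_score


# Toy stand-ins (assumptions for illustration only).
TARGET = (
    "\\draw (0,0) -- (1,0);",
    "\\draw (1,0) -- (1,1);",
    "\\draw (1,1) -- (0,0);",
)


def toy_propose(lines, rng):
    # A real system would sample from a model conditioned on `lines`.
    return rng.choice(TARGET)


def toy_score(program):
    # Fraction of lines that match the target at the same position.
    return sum(a == b for a, b in zip(program, TARGET)) / len(TARGET)


program, value = mcts_refine(toy_propose, toy_score, max_lines=len(TARGET))
print(value)
print("\n".join(program))
```

Because the search keeps the best complete program found across all rollouts, running more iterations can only keep or improve the returned score, which is the sense in which such a loop refines its output iteratively.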
Abstract
Creating high-quality scientific figures can be time-consuming and challenging, even though sketching ideas on paper is relatively easy. Furthermore, recreating existing figures that are not stored in formats preserving semantic information is equally complex. To tackle this problem, we introduce DeTikZify, a novel multimodal language model that automatically synthesizes scientific figures as semantics-preserving TikZ graphics programs based on sketches and existing figures. To achieve this, we create three new datasets: DaTikZv2, the largest TikZ dataset to date, containing over 360k human-created TikZ graphics; SketchFig, a dataset that pairs hand-drawn sketches with their corresponding scientific figures; and MetaFig, a collection of diverse scientific figures and associated metadata. We train DeTikZify on MetaFig and DaTikZv2, along with synthetically generated sketches learned from…
Taxonomy
Topics: Teaching and Learning Programming · Interactive and Immersive Displays · Augmented Reality Applications
