FractalBench: Diagnosing Visual-Mathematical Reasoning Through Recursive Program Synthesis
Jan Ondras (1) and Marek Šuppa (2) ((1) MIT, (2) Comenius University, Cisco)

TL;DR
FractalBench is a benchmark that assesses whether multimodal AI systems can synthesize recursive fractal programs from images. It reveals significant gaps in mathematical abstraction even when the generated code is syntactically valid.
Contribution
This work introduces FractalBench, a novel diagnostic benchmark for evaluating visual-mathematical reasoning in multimodal models through fractal program synthesis.
Findings
76% of model outputs are syntactically valid code, but only 4% capture the fractal's mathematical structure
Models perform better on geometric transformations (17-21%) than on recursive branching (<2%)
The results reveal fundamental gaps in AI models' ability to abstract mathematical rules from visual patterns
Abstract
Mathematical reasoning requires abstracting symbolic rules from visual patterns -- inferring the infinite from the finite. We investigate whether multimodal AI systems possess this capability through FractalBench, a benchmark evaluating fractal program synthesis from images. Fractals provide ideal test cases: Iterated Function Systems with only a few contraction maps generate complex self-similar patterns through simple recursive rules, requiring models to bridge visual perception with mathematical abstraction. We evaluate four leading MLLMs -- GPT-4o, Claude 3.7 Sonnet, Gemini 2.5 Flash, and Qwen 2.5-VL -- on 12 canonical fractals. Models must generate executable Python code reproducing the fractal, enabling objective evaluation. Results reveal a striking disconnect: 76% generate syntactically valid code but only 4% capture mathematical structure. Success varies systematically --…
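To make the abstract's description of Iterated Function Systems concrete, here is a minimal sketch of how a few contraction maps, applied recursively, produce a self-similar point set. The Sierpinski triangle is used only as a familiar illustration; this excerpt does not list the benchmark's 12 fractals, and the names `make_contraction`, `iterate_ifs`, and `SIERPINSKI_MAPS` are hypothetical, not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Map = Callable[[Point], Point]


def make_contraction(cx: float, cy: float, ratio: float = 0.5) -> Map:
    """Return the map p -> ratio * p + (1 - ratio) * c, which shrinks the plane toward c."""
    def f(p: Point) -> Point:
        x, y = p
        return (ratio * x + (1 - ratio) * cx, ratio * y + (1 - ratio) * cy)
    return f


# Illustrative assumption: three maps, each halving distances toward one triangle vertex.
VERTICES: List[Point] = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
SIERPINSKI_MAPS: List[Map] = [make_contraction(cx, cy) for cx, cy in VERTICES]


def iterate_ifs(points: Sequence[Point], maps: Sequence[Map], depth: int) -> List[Point]:
    """Recursively replace the point set with the union of its images under every map."""
    if depth == 0:
        return list(points)
    images = [f(p) for f in maps for p in points]
    return iterate_ifs(images, maps, depth - 1)


# Each level triples the number of points, so depth 5 yields 3**5 points.
points = iterate_ifs([(0.0, 0.0)], SIERPINSKI_MAPS, depth=5)
print(len(points))  # 243
```

The whole pattern is determined by the three maps and the recursion depth, which is the kind of compact rule a model would need to infer from an image rather than copying pixel-level detail.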
Taxonomy
Topics: Cognitive and developmental aspects of mathematical skills · Evolutionary Algorithms and Applications · Teaching and Learning Programming
