Evaluating Compliance with Visualization Guidelines in Diagrams for Scientific Publications Using Large Vision Language Models
Johannes Rückert, Louise Bloch, Christoph M. Friedrich

TL;DR
This paper explores the use of large Vision Language Models to automatically analyze scientific diagrams for adherence to visualization guidelines, identifying common issues and potential misinformation sources.
Contribution
It shows that open-source VLMs can check several guideline-related aspects of diagrams, such as diagram type, 3D effects, and axis labels, which offers an automated way to improve the quality of scientific visualizations.
Findings
VLMs identify diagram types with an F1-score of 82.49%
They identify issues like missing labels and unnecessary 3D effects
Qwen2.5VL performs best among tested models
Abstract
Diagrams are widely used to visualize data in publications. The research field of data visualization deals with defining principles and guidelines for the creation and use of these diagrams. Researchers often do not know or do not adhere to these guidelines, which leads to misinformation through inaccurate or incomplete information. In this work, large Vision Language Models (VLMs) are used to analyze diagrams in order to identify potential problems with regard to selected data visualization principles and guidelines. To determine the suitability of VLMs for these tasks, five open-source VLMs and five prompting strategies are compared using a set of questions derived from selected data visualization guidelines. The results show that the employed VLMs work well to accurately analyze diagram types (F1-score 82.49 %), 3D effects (F1-score 98.55 %), axes labels (F1-score 76.74 %), lines…
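The abstract reports results as F1-scores. As a rough illustration of what that metric measures, the following is a minimal sketch for a single yes/no question (for example, whether a diagram uses 3D effects). The answer lists are hypothetical and are not data from the paper, and the abstract does not state how scores were averaged across classes or questions, so the binary setup here is an assumption.

```python
def f1_score(gold, predicted, positive="yes"):
    """F1 for one binary question, treating `positive` as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)  # true positives
    fp = sum(g != positive and p == positive for g, p in pairs)  # false positives
    fn = sum(g == positive and p != positive for g, p in pairs)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision or recall is 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations and model answers (illustration only).
gold = ["yes", "no", "no", "yes", "no", "yes"]
predicted = ["yes", "no", "yes", "yes", "no", "no"]

score = f1_score(gold, predicted)
print(f"F1 = {score:.2%}")  # prints "F1 = 66.67%"
```

Because F1 combines precision and recall, a model that answers "yes" for every diagram would reach full recall but a low F1 when most diagrams have no such issue.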
