Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies
Zhiqiang Zhong, Davide Mottin

TL;DR
This paper investigates how multimodal large language models, which process both text and images, understand graph structures better than text-only models, by analyzing their performance on various benchmark tasks involving graph visualizations.
Contribution
It provides the first systematic analysis of multimodal LLMs' ability to comprehend graph structures using visual representations, highlighting their strengths and limitations.
Findings
Multimodal models outperform text-only models on graph comprehension tasks.
Visual graph representations enhance understanding at node, edge, and graph levels.
Limitations include challenges in interpreting complex visualizations.
Abstract
Large Language Models (LLMs) have shown remarkable capabilities in processing various data structures, including graphs. While previous research has focused on developing textual encoding methods for graph representation, the emergence of multimodal LLMs presents a new frontier for graph comprehension. These advanced models, capable of processing both text and images, offer potential improvements in graph understanding by incorporating visual representations alongside traditional textual data. This study investigates the impact of graph visualisations on LLM performance across a range of benchmark tasks at node, edge, and graph levels. Our experiments compare the effectiveness of multimodal approaches against purely textual graph representations. The results provide valuable insights into both the potential and limitations of leveraging visual graph modalities to enhance LLMs' graph…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
