MMRel: Benchmarking Relation Understanding in Multi-Modal Large Language Models

Jiahao Nie; Gongjie Zhang; Wenbin An; Yun Xing; Yap-Peng Tan; Alex C. Kot; Shijian Lu

arXiv:2406.09121 · cs.CV · December 19, 2025

TL;DR

This paper introduces MMRel, a large-scale, high-quality benchmark dataset for evaluating and improving how Multi-modal Large Language Models (MLLMs) understand inter-object relations, an area where current models remain weak.

Contribution

The paper presents MMRel, a comprehensive benchmark with diverse, high-quality relation data, and demonstrates its effectiveness in evaluating and enhancing MLLMs' relation understanding capabilities.

Findings

1. MMRel improves evaluation accuracy for MLLMs on relation tasks.
2. Fine-tuning MLLMs with MMRel enhances their relation comprehension.
3. Extensive experiments validate MMRel's utility across 28 MLLMs.

Abstract

Though Multi-modal Large Language Models (MLLMs) have recently achieved significant progress, they often struggle to understand diverse and complicated inter-object relations. Specifically, the lack of large-scale and high-quality relation data has greatly hindered the progress of MLLMs in various vision-language perception tasks. We attempt to address this challenge by contributing the Multi-Modal Relation Understanding benchmark (MMRel), which features large-scale, high-quality, and diverse data on inter-object relations. MMRel has three distinctive attributes: (i) it contains 22,500 question-answer pairs spanning three distinct domains and around 400 relations, ensuring both scale and diversity; (ii) it provides manually verified, high-quality labels to ensure exceptional annotation accuracy; and (iii) it includes adversarial cases with highly unusual relations, offering a…

Code & Models

Repositories

niejiahao1998/mmrel (official)

Datasets

jiahaonie/MMRel (dataset, 223 downloads)

Taxonomy

Topics: Semantic Web and Ontologies