MULTIBENCH++: A Unified and Comprehensive Multimodal Fusion Benchmarking Across Specialized Domains

Leyan Xue; Changqing Zhang; Kecheng Xue; Xiaohong Liu; Guangyu Wang; Zongbo Han

arXiv:2511.06452 · cs.LG · May 7, 2026

TL;DR

This paper introduces MULTIBENCH++, a large-scale, unified benchmark with over 30 datasets across 15 modalities for evaluating multimodal fusion methods, addressing evaluation biases and fostering fair comparisons.

Contribution

It presents a comprehensive, domain-adaptive benchmark and an open-source evaluation pipeline, enabling rigorous, reproducible assessment of multimodal fusion models.

Findings

01. Established new performance baselines across multiple tasks.

02. Demonstrated the effectiveness of the benchmark in evaluating diverse fusion methods.

Abstract

Although multimodal fusion has made significant progress, its advancement is severely hindered by the lack of adequate evaluation benchmarks. Current fusion methods are typically evaluated on a small selection of public datasets, a limited scope that inadequately represents the complexity and diversity of real-world scenarios, potentially leading to biased evaluations. This issue presents a twofold challenge. On one hand, models may overfit to the biases of specific datasets, hindering their generalization to broader practical applications. On the other hand, the absence of a unified evaluation standard makes fair and objective comparisons between different fusion methods difficult. Consequently, a truly universal and high-performance fusion model has yet to emerge. To address these challenges, we have developed a large-scale, domain-adaptive benchmark for multimodal evaluation. This…
