MMR-Bench: A Comprehensive Benchmark for Multimodal LLM Routing

Haoxuan Ma; Guannan Lai; Han-Jia Ye

arXiv:2601.17814·cs.AI·January 27, 2026

Open Access

TL;DR

MMR-Bench is a comprehensive benchmark for evaluating and improving model routing among multimodal large language models (MLLMs), targeting the trade-off between accuracy and computational cost across diverse tasks.

Contribution

MMR-Bench provides a standardized, modality-aware benchmarking environment for routing among MLLMs, enabling fair comparison and development of cost-effective, accurate model selection policies.

Findings

01. Incorporating multimodal signals enhances routing quality.

02. Routing policies can surpass single-model accuracy at lower costs.

03. Policies trained on limited data generalize well to new datasets.
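
The second finding can look counterintuitive, so here is a toy illustration of why per-query selection may beat the best single model while spending less on average. Everything in it is an assumption made for illustration: the model names `small` and `large`, the cost units, and the correctness values are synthetic and are not taken from MMR-Bench. The router shown is an oracle (it sees which models answer correctly), so it is an upper bound rather than a learnable policy.

```python
# Synthetic data, invented for illustration only (not MMR-Bench results).
# correct[m][i] is 1 if hypothetical model m answers query i correctly.
correct = {
    "small": [1, 1, 0, 1, 0, 1],
    "large": [1, 0, 1, 1, 1, 1],
}
# Hypothetical per-query cost of each model, in arbitrary units.
cost = {"small": 1.0, "large": 5.0}


def single_model_stats(name):
    """Accuracy and per-query cost when every query goes to one model."""
    acc = sum(correct[name]) / len(correct[name])
    return acc, cost[name]


def oracle_router_stats():
    """Accuracy and average cost of an oracle that picks, per query,
    the cheapest model that is correct (or the cheapest model overall
    if none is correct)."""
    n = len(next(iter(correct.values())))
    hits, spent = 0, 0.0
    for i in range(n):
        ok = [m for m in correct if correct[m][i]]
        choice = min(ok or list(correct), key=lambda m: cost[m])
        hits += correct[choice][i]
        spent += cost[choice]
    return hits / n, spent / n


if __name__ == "__main__":
    for name in correct:
        print(name, single_model_stats(name))
    print("oracle router", oracle_router_stats())
```

Because the two hypothetical models fail on different queries, the oracle answers every query correctly while sending only two of the six queries to the expensive model. A learned router would fall somewhere between the single-model baselines and this bound.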

Abstract

Multimodal large language models (MLLMs) have advanced rapidly, yet heterogeneity in architecture, alignment strategies, and efficiency means that no single model is uniformly superior across tasks. In practical deployments, workloads span lightweight OCR to complex multimodal reasoning; using one MLLM for all queries either over-provisions compute on easy instances or sacrifices accuracy on hard ones. Query-level model selection (routing) addresses this tension, but extending routing from text-only LLMs to MLLMs is nontrivial due to modality fusion, wide variation in computational cost across models, and the absence of a standardized, budget-aware evaluation. We present MMR-Bench, a unified benchmark that isolates the multimodal routing problem and enables comparison under fixed candidate sets and cost models. MMR-Bench provides (i) a controlled environment with modality-aware inputs…
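
As a rough sketch of what budget-aware, query-level routing can look like in code, the example below picks the candidate that maximizes predicted accuracy minus a cost penalty. The abstract does not state the routing objective, so this linear utility, the `Candidate` type, the `route` function, the penalty weight `lam`, and all numbers are assumptions introduced for illustration. In practice the predicted accuracies would come from a trained router that reads the text and image inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float  # hypothetical per-query cost, arbitrary units


def route(pred_acc, candidates, lam):
    """Pick the candidate maximizing pred_acc - lam * cost.

    pred_acc maps a candidate name to the router's predicted probability
    of a correct answer for this query (assumed to be given).
    lam >= 0 sets how strongly cost is penalized; lam = 0 ignores cost.
    """
    return max(candidates, key=lambda c: pred_acc[c.name] - lam * c.cost)


# Hypothetical candidate set and router predictions for two queries.
candidates = [Candidate("small", 1.0), Candidate("large", 5.0)]
easy_query = {"small": 0.90, "large": 0.95}  # e.g. a simple OCR request
hard_query = {"small": 0.30, "large": 0.85}  # e.g. multi-step visual reasoning

if __name__ == "__main__":
    for lam in (0.0, 0.05, 0.2):
        easy_choice = route(easy_query, candidates, lam).name
        hard_choice = route(hard_query, candidates, lam).name
        print(f"lam={lam}: easy -> {easy_choice}, hard -> {hard_choice}")
```

Sweeping `lam` traces out an accuracy-cost curve: at zero everything goes to the most accurate model, at intermediate values only hard queries justify the expensive model, and at large values everything goes to the cheapest one. Comparing such curves under a fixed candidate set and cost model is the kind of evaluation the abstract describes.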

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling