MiRAGE: A Multiagent Framework for Generating Multimodal Multihop Question-Answer Dataset for RAG Evaluation
Chandan Kumar Sahu, Premith Kumar Chilukuri, Matthew Hetrich

TL;DR
MiRAGE is a multiagent framework that generates complex, multimodal, multi-hop question-answer datasets for RAG system evaluation, addressing the lack of domain-specific benchmarks with higher reasoning complexity and factual accuracy.
Contribution
Introduces MiRAGE, a novel multiagent system that automates the creation of domain-specific, multimodal, multi-hop QA datasets for RAG evaluation, improving over existing benchmarks.
Findings
MiRAGE produces datasets with over 2.3 reasoning hops on average.
Datasets exhibit higher factual faithfulness and reasoning complexity.
The framework remains effective when driven by LLMs, provided textual descriptions of the images are available.
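To illustrate how an average hop count like the one reported above might be computed, here is a minimal sketch. The record layout (a `hops` list of evidence chunk IDs per question), the definition of a hop as one evidence chunk, and the sample data are all assumptions made for illustration; none of them come from the MiRAGE paper or its code.

```python
from statistics import mean

# Hypothetical QA records: each lists the evidence chunks its answer depends on.
qa_records = [
    {"question": "Q1", "hops": ["chunk_a", "chunk_b"]},
    {"question": "Q2", "hops": ["chunk_c", "chunk_d", "chunk_e"]},
    {"question": "Q3", "hops": ["chunk_f", "chunk_g"]},
]


def average_hops(records):
    """Return the mean number of evidence chunks (hops) per question."""
    if not records:
        raise ValueError("no records to average")
    return mean(len(r["hops"]) for r in records)


avg = average_hops(qa_records)
```

Under this assumed definition, a dataset whose questions mostly need two or three pieces of evidence would land in the range the findings describe.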
Abstract
The rapid evolution of Retrieval-Augmented Generation (RAG) toward multimodal, high-stakes enterprise applications has outpaced the development of domain-specific evaluation benchmarks. Existing datasets often rely on general-domain corpora or purely textual retrieval, failing to capture the complexity of specialized technical documents where information is inextricably multimodal and reasoning requires synthesizing disjoint evidence. We address this gap by introducing MiRAGE, a Multiagent framework for RAG systems Evaluation that leverages a collaborative swarm of specialized agents to generate verified, domain-specific, multimodal, and multi-hop Question-Answer datasets. MiRAGE orchestrates a swarm of specialized agents: a recursive context optimization loop to aggregate scattered evidence, an adversarial verifier agent to guarantee factual grounding, and an agent to recognize the…
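The two components named in the abstract, a context-aggregation loop and a verifier, can be pictured with a rough sketch. This is not the authors' implementation: the function names, the `max_rounds` limit, and the toy stand-ins for the agents are all hypothetical, and real agents would call an LLM or vision-language model rather than the simple lambdas-style stubs used here.

```python
from typing import Callable, Dict, List, Optional, Tuple


def build_context(seed: str,
                  find_related: Callable[[List[str]], Optional[str]],
                  max_rounds: int = 3) -> List[str]:
    """Grow the context one chunk at a time until nothing new is found."""
    context = [seed]
    for _ in range(max_rounds):
        extra = find_related(context)
        if extra is None or extra in context:
            break
        context.append(extra)
    return context


def generate_verified_qa(seed: str,
                         find_related: Callable[[List[str]], Optional[str]],
                         generate_qa: Callable[[List[str]], Tuple[str, str]],
                         verify: Callable[[str, str, List[str]], bool]
                         ) -> Optional[Dict[str, object]]:
    """Aggregate evidence, draft a QA pair, and keep it only if verified."""
    context = build_context(seed, find_related)
    question, answer = generate_qa(context)
    if verify(question, answer, context):
        return {"question": question, "answer": answer, "context": context}
    return None


# Toy stand-ins for the agents (assumptions, for illustration only).
links = {"spec text": "figure caption", "figure caption": "table note"}


def find_related(context: List[str]) -> Optional[str]:
    return links.get(context[-1])


def generate_qa(context: List[str]) -> Tuple[str, str]:
    return (f"What do {len(context)} sources jointly state?", " + ".join(context))


def verify(question: str, answer: str, context: List[str]) -> bool:
    # Accept only answers that draw on every aggregated chunk.
    return all(chunk in answer for chunk in context)


result = generate_verified_qa("spec text", find_related, generate_qa, verify)
```

Keeping the agents as plain callables makes it easy to swap a stub for a model-backed agent without touching the control flow.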
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Graph Neural Networks
