MIRAGE: Scaling Test-Time Inference with Parallel Graph-Retrieval-Augmented Reasoning Chains

Kaiwen Wei; Rui Shan; Dongsheng Zou; Jianzhong Yang; Bi Zhao; Junnan Zhu; Jiang Zhong

arXiv:2508.18260·cs.CL·January 21, 2026

TL;DR

MIRAGE is a scalable reasoning framework that enhances medical question-answering by executing parallel, structured inference chains over knowledge graphs, improving accuracy, traceability, and interpretability.

Contribution

MIRAGE introduces a multi-chain inference approach over knowledge graphs that addresses error propagation and improves interpretability in medical reasoning tasks.

Findings

01 Outperforms GPT-4o, Tree-of-Thought, and other baselines in medical QA

02 Improves interpretability with explicit reasoning chains

03 Enhances accuracy and traceability in complex medical reasoning

Abstract

Large reasoning models (LRMs) have shown significant progress in test-time scaling through chain-of-thought prompting. Current approaches like search-o1 integrate retrieval-augmented generation (RAG) into multi-step reasoning processes but rely on a single, linear reasoning chain while incorporating unstructured textual information in a flat, context-agnostic manner. As a result, these approaches can accumulate errors throughout the reasoning chain, which significantly limits their effectiveness in medical question-answering (QA) tasks where both accuracy and traceability are critical requirements. To address these challenges, we propose MIRAGE (Multi-chain Inference with Retrieval-Augmented Graph Exploration), a novel test-time scalable reasoning framework that performs dynamic multi-chain inference over structured medical knowledge graphs. Specifically, MIRAGE 1) decomposes…
