M-RAG: Making RAG Faster, Stronger, and More Efficient
Sun Xu, Tongkai Xu, Baiheng Xie, Li Huang, Qiang Gao, Kunpeng Zhang

TL;DR
M-RAG introduces a chunk-free retrieval method using structured meta-markers, improving efficiency and relevance in retrieval-augmented generation for large language models.
Contribution
The paper proposes a novel chunk-free retrieval strategy that uses structured meta-markers to improve retrieval relevance and efficiency, outperforming traditional chunk-based RAG methods.
Findings
M-RAG outperforms chunk-based RAG on LongBench subtasks.
It retrieves more answer-friendly evidence with higher efficiency.
M-RAG is effective under low-resource settings.
Abstract
Retrieval-Augmented Generation (RAG) has become a widely adopted paradigm for enhancing the reliability of large language models (LLMs). However, RAG systems are sensitive to retrieval strategies that rely on text chunking to construct retrieval units, which often introduce information fragmentation, retrieval noise, and reduced efficiency. Recent work has even questioned the necessity of RAG, arguing that long-context LLMs may eliminate multi-stage retrieval pipelines by directly processing full documents. Nevertheless, expanded context capacity alone does not resolve the challenges of relevance filtering, evidence prioritization, and isolating answer-bearing information. To this end, we propose M-RAG, a novel chunk-free retrieval strategy. Instead of retrieving coarse-grained textual chunks, M-RAG extracts structured meta-markers based on key-value (k-v) decomposition, with a lightweight, intent-aligned…
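The contrast the abstract draws between chunk-based and chunk-free retrieval units can be illustrated with a toy sketch. This is not the authors' implementation: the marker dictionary, the function names (`retrieve_chunks`, `retrieve_markers`), and the token-overlap scorer are all assumptions made for illustration, and the abstract does not say how M-RAG extracts or scores its meta-markers.

```python
import re

def tokens(text):
    """Lowercase word tokens; a toy stand-in for a real retriever's representation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(query, text):
    """Fraction of query tokens that also appear in text."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0

def retrieve_chunks(query, document, chunk_size=12, top_k=1):
    """Chunk-based baseline: split into fixed-size word windows and rank them."""
    words = document.split()
    chunks = [" ".join(words[i:i + chunk_size])
              for i in range(0, len(words), chunk_size)]
    return sorted(chunks, key=lambda c: overlap(query, c), reverse=True)[:top_k]

def retrieve_markers(query, markers, top_k=1):
    """Chunk-free variant (hypothetical): rank key-value markers by key match."""
    ranked = sorted(markers.items(),
                    key=lambda kv: overlap(query, kv[0]), reverse=True)
    return ranked[:top_k]

document = ("The Eiffel Tower was completed in 1889 for the World's Fair. "
            "It was designed by the firm of Gustave Eiffel. "
            "The tower is 330 metres tall and stands in Paris.")

# Hand-written markers for illustration; a real system would extract these.
markers = {
    "Eiffel Tower completion year": "1889",
    "Eiffel Tower designer": "the firm of Gustave Eiffel",
    "Eiffel Tower height": "330 metres",
}

query = "What is the height of the Eiffel Tower?"
print(retrieve_chunks(query, document))   # a fixed-size window of raw text
print(retrieve_markers(query, markers))   # a single key-value pair
```

In this sketch the chunk retriever returns a window whose boundaries ignore sentence structure, while the marker retriever returns one compact key-value pair, which is the kind of fragmentation and noise trade-off the abstract describes.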
