MacRAG: Compress, Slice, and Scale-up for Multi-Scale Adaptive Context RAG
Woosang Lim, Zekun Li, Gyuwan Kim, Sungyoung Ji, HyeonJung Kim, Kyuri Choi, Jin Hyuk Lim, Kyungpyo Park, William Yang Wang

TL;DR
MacRAG introduces a hierarchical, multi-scale retrieval framework that enhances long-context reasoning in large language models by adaptively merging relevant document segments, improving retrieval precision and context coverage.
Contribution
We propose MacRAG, a novel hierarchical RAG system that compresses and partitions documents into multiple granularities, enabling adaptive context expansion for better long-context reasoning.
Findings
Outperforms baseline RAG in multi-hop question answering tasks.
Effective long-context reasoning with Llama-3.1-8B, Gemini-1.5-pro, and GPT-4o.
Scalable and efficient for real-world applications.
Abstract
Long-context large language models (LC LLMs) combined with retrieval-augmented generation (RAG) hold strong potential for complex multi-hop and large-document tasks. However, existing RAG systems often suffer from imprecise retrieval, incomplete context coverage under constrained windows, and fragmented information from suboptimal context construction. We introduce Multi-scale Adaptive Context RAG (MacRAG), a hierarchical RAG framework that compresses and partitions documents into coarse-to-fine granularities, then adaptively merges relevant contexts through real-time chunk- and document-level expansions. By initiating with finest-level retrieval and progressively incorporating broader, higher-level context, MacRAG constructs effective query-specific long contexts, optimizing both precision and coverage. Evaluations on challenging LongBench expansions of HotpotQA, 2WikiMultihopQA, and…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · WordPiece
