QChunker: Learning Question-Aware Text Chunking for Domain RAG via Multi-Agent Debate

Jihao Zhao; Daixuan Li; Pengfei Li; Shuaishuai Zu; Biao Qin; Hongyan Liu

arXiv:2603.11650·cs.CL·March 13, 2026

TL;DR

QChunker introduces a multi-agent debate framework to improve text chunking for RAG systems, enhancing semantic integrity and information richness through question-driven segmentation and a novel evaluation metric.

Contribution

The paper presents a new multi-agent debate approach for question-aware text chunking and introduces ChunkScore, a direct evaluation metric for chunk quality, improving RAG performance.

Findings

01. QChunker produces more coherent and information-rich text chunks.

02. ChunkScore effectively evaluates chunk quality directly.

03. Experimental results show improved RAG performance across domains.

Abstract

The effectiveness upper bound of retrieval-augmented generation (RAG) is fundamentally constrained by the semantic integrity and information granularity of text chunks in its knowledge base. To address these challenges, this paper proposes QChunker, which restructures the RAG paradigm from retrieval-augmentation to understanding-retrieval-augmentation. Firstly, QChunker models the text chunking as a composite task of text segmentation and knowledge completion to ensure the logical coherence and integrity of text chunks. Drawing inspiration from Hal Gregersen's "Questions Are the Answer" theory, we design a multi-agent debate framework comprising four specialized components: a question outline generator, text segmenter, integrity reviewer, and knowledge completer. This framework operates on the principle that questions serve as catalysts for profound insights. Through this pipeline, we…
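To make the four roles named in the abstract concrete, here is a minimal sketch of how a question outline generator, text segmenter, integrity reviewer, and knowledge completer might be chained. It is an illustration under stated assumptions, not the authors' implementation: every function name, the `Chunk` type, and the toy heuristics are hypothetical, each "agent" is a plain Python function standing in for what would presumably be an LLM call, and the debate between agents is not modeled at all.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical toy heuristic: a chunk opening with one of these words
# probably refers back to text that was cut off by the segmenter.
DANGLING_OPENERS = {"it", "this", "that", "these", "those", "they"}


@dataclass
class Chunk:
    text: str
    questions: List[str] = field(default_factory=list)
    complete: bool = True


def split_paragraphs(document: str) -> List[str]:
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]


def outline_questions(document: str) -> List[str]:
    """Stand-in question outline generator: one placeholder question per paragraph."""
    count = len(split_paragraphs(document))
    return [f"What does paragraph {i + 1} explain?" for i in range(count)]


def segment(document: str, questions: List[str]) -> List[Chunk]:
    """Stand-in text segmenter: one chunk per paragraph, paired with its question."""
    return [Chunk(text=p, questions=[q])
            for p, q in zip(split_paragraphs(document), questions)]


def review_integrity(chunk: Chunk) -> Chunk:
    """Stand-in integrity reviewer: flag chunks that open with a dangling reference."""
    first_word = chunk.text.split()[0].lower().strip(",.")
    chunk.complete = first_word not in DANGLING_OPENERS
    return chunk


def complete_knowledge(chunk: Chunk, previous_text: str) -> Chunk:
    """Stand-in knowledge completer: prepend the preceding chunk as context."""
    if not chunk.complete and previous_text:
        chunk.text = f"[Context: {previous_text}] {chunk.text}"
        chunk.complete = True
    return chunk


def run_pipeline(document: str) -> List[Chunk]:
    """Run the four stand-in stages in order over one document."""
    questions = outline_questions(document)
    chunks = [review_integrity(c) for c in segment(document, questions)]
    # Keep the unmodified texts so added context never nests recursively.
    originals = [c.text for c in chunks]
    return [complete_knowledge(c, originals[i - 1] if i > 0 else "")
            for i, c in enumerate(chunks)]
```

Calling `run_pipeline` on a two-paragraph document whose second paragraph begins with "It" leaves the first chunk unchanged and prepends the first paragraph to the second as context. A linear pass like this only shows the division of labor; how the agents actually interact is not recoverable from the truncated abstract.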

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications