SF-RAG: Structure-Fidelity Retrieval-Augmented Generation for Academic Question Answering
Rui Yu, Tianyi Wang, Ruixia Liu, Yinglong Wang

TL;DR
SF-RAG introduces a hierarchical, structure-aware retrieval framework for academic question answering, significantly reducing fragmentation and improving evidence accuracy by leveraging native paper structures.
Contribution
The paper proposes SF-RAG, a novel retrieval method that maintains the native hierarchical structure of papers, enhancing retrieval coherence and evidence allocation in academic QA.
Findings
Reduces retrieval fragmentation significantly.
Improves evidence allocation accuracy.
Enhances answer quality in QA benchmarks.
Abstract
Efficient question-answering (QA) over extensive scientific literature is essential for evidence-based engineering decision-making. Retrieval-augmented generation (RAG) is increasingly applied to question-answering over long academic papers, where accurate evidence allocation under a fixed token budget is critical. However, existing approaches flatten papers into unstructured chunks, destroying the native hierarchical structure and forcing retrieval to operate in a disordered space. This produces fragmented contexts, misallocates tokens to non-evidential regions, and increases the reasoning burden for downstream language models. To address these issues, we propose SF-RAG, an RAG framework that treats the native hierarchical structure of academic papers as a low-entropy retrieval prior. SF-RAG first inherits the native hierarchy to construct a structure-fidelity index, which prevents…
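The contrast the abstract draws between flat chunking and a hierarchy-preserving index can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is truncated here, so the `Node` class, the `build_tree` and `retrieve` helpers, the greedy top-down descent, the word-overlap score, and the use of a word count as a stand-in for a token budget are all assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of a paper; children are its subsections (hypothetical type)."""
    title: str
    text: str = ""
    children: list = field(default_factory=list)


def build_tree(sections):
    """Build a section tree from (depth, title, text) triples in reading order."""
    root = Node("ROOT")
    stack = [(0, root)]  # path of (depth, node) from the root to the current section
    for depth, title, text in sections:
        while stack[-1][0] >= depth:  # climb back up to this section's parent
            stack.pop()
        node = Node(title, text)
        stack[-1][1].children.append(node)
        stack.append((depth, node))
    return root


def tokens(s):
    return set(re.findall(r"\w+", s.lower()))


def subtree_text(node):
    return " ".join([node.title, node.text] + [subtree_text(c) for c in node.children])


def retrieve(root, query, budget):
    """Greedy descent: at each level follow the child whose subtree best matches
    the query (toy word overlap), then cut the chosen section to `budget` words."""
    q = tokens(query)
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda c: len(q & tokens(subtree_text(c))))
        path.append(node.title)
    return path, " ".join(node.text.split()[:budget])


# Made-up paper outline used only to exercise the sketch.
sections = [
    (1, "Introduction", "Retrieval-augmented generation for long papers."),
    (1, "Method", ""),
    (2, "Index Construction", "The index inherits the section hierarchy of the paper."),
    (2, "Retrieval", "Sections are selected under a fixed token budget."),
    (1, "Experiments", "Evaluation on academic question answering benchmarks."),
]
root = build_tree(sections)
print(retrieve(root, "how is the token budget used", 50))
```

Because the retrieved passage is a whole section reached through its parent headings, the returned path records where the evidence sits in the paper, which a flat list of chunks would not preserve.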
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Information Retrieval and Search Behavior
