SF-RAG: Structure-Fidelity Retrieval-Augmented Generation for Academic Question Answering

Rui Yu; Tianyi Wang; Ruixia Liu; Yinglong Wang

arXiv:2602.13647·cs.IR·March 20, 2026

Open Access

TL;DR

SF-RAG introduces a hierarchical, structure-aware retrieval framework for academic question answering, significantly reducing fragmentation and improving evidence accuracy by leveraging native paper structures.

Contribution

The paper proposes SF-RAG, a novel retrieval method that maintains the native hierarchical structure of papers, enhancing retrieval coherence and evidence allocation in academic QA.

Findings

01 Reduces retrieval fragmentation significantly.

02 Improves evidence allocation accuracy.

03 Enhances answer quality in QA benchmarks.

Abstract

Efficient question-answering (QA) over extensive scientific literature is essential for evidence-based engineering decision-making. Retrieval-augmented generation (RAG) is increasingly applied to question-answering over long academic papers, where accurate evidence allocation under a fixed token budget is critical. However, existing approaches flatten papers into unstructured chunks, destroying the native hierarchical structure and forcing retrieval to operate in a disordered space. This produces fragmented contexts, misallocates tokens to non-evidential regions, and increases the reasoning burden for downstream language models. To address these issues, we propose SF-RAG, an RAG framework that treats the native hierarchical structure of academic papers as a low-entropy retrieval prior. SF-RAG first inherits the native hierarchy to construct a structure-fidelity index, which prevents…
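
The contrast the abstract draws between flattened chunks and a hierarchy-preserving index can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `(level, heading, text)` input format, and the function names `build_tree`, `indexed_units`, and `flat_chunks` are all hypothetical, chosen only to show how a unit that keeps its heading path differs from a fixed-size window that ignores section boundaries.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of a paper, linked to its subsections (hypothetical type)."""
    heading: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def build_tree(sections):
    """Build a section tree from (level, heading, text) tuples in reading order.

    Assumes heading levels start at 1; level 0 is reserved for the root.
    """
    root = Node("ROOT")
    stack = [(0, root)]  # (level, node) pairs for the currently open sections
    for level, heading, text in sections:
        # Close any open section at the same or a deeper level.
        while stack[-1][0] >= level:
            stack.pop()
        node = Node(heading, text)
        stack[-1][1].children.append(node)
        stack.append((level, node))
    return root


def indexed_units(node, path=()):
    """Yield (heading_path, text) so each unit keeps its place in the hierarchy."""
    for child in node.children:
        child_path = path + (child.heading,)
        if child.text:
            yield child_path, child.text
        yield from indexed_units(child, child_path)


def flat_chunks(sections, size):
    """Baseline: concatenate all text and cut fixed-size word windows."""
    words = " ".join(text for _, _, text in sections).split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


# Toy paper outline, invented for illustration only.
paper = [
    (1, "Method", "We describe the approach."),
    (2, "Indexing", "Sections are stored as tree nodes."),
    (2, "Retrieval", "Search walks the tree."),
    (1, "Results", "Accuracy improves."),
]
units = list(indexed_units(build_tree(paper)))
chunks = flat_chunks(paper, size=6)
```

In this toy outline, every entry in `units` carries a heading path such as `("Method", "Indexing")`, while the second entry in `chunks` mixes words from the Indexing and Retrieval subsections, which is the kind of fragmentation the abstract attributes to flattening.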

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Information Retrieval and Search Behavior