# GOSU: Retrieval-Augmented Generation with Global-Level Optimized Semantic Unit-Centric Framework

**Authors:** Xuecheng Zou, Ke Liu, Bingbing Wang, Huafei Deng, Li Zhang, Yu Tang

arXiv: 2509.00449 · 2025-09-03

## TL;DR

GOSU is a semantic unit-centric RAG framework that disambiguates semantic units (SUs) globally rather than within each local text chunk, and uses them to capture interconnections between nodes across the global context. The authors report better generation quality than baseline RAG methods across multiple tasks.

## Contribution

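The paper proposes GOSU, a framework that globally merges the semantic units pre-extracted from local text chunks and uses the merged units to guide entity and relationship extraction. This targets the ambiguity, complex coupling, and retrieval overhead that arise when high-level semantic units are extracted from local chunks only, and it eases coreference resolution while uncovering semantic objects that span chunks.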
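To make the idea of global merging concrete, here is a minimal sketch, not the authors' implementation. It assumes each chunk has already produced semantic units as a label plus a set of entity mentions, and it merges units whose normalized labels match. The `SemanticUnit` class, the `normalize` key, and the sample data are all hypothetical; the paper's actual merging criterion is not described in this summary.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticUnit:
    label: str                      # short description of the unit
    entities: set[str]              # entity mentions the unit connects
    chunk_ids: set[int] = field(default_factory=set)


def normalize(label: str) -> str:
    # Hypothetical merge key: case- and whitespace-insensitive label match.
    return " ".join(label.lower().split())


def merge_globally(local_units: list[SemanticUnit]) -> dict[str, SemanticUnit]:
    """Fold per-chunk units into one unit per normalized label."""
    merged: dict[str, SemanticUnit] = {}
    for unit in local_units:
        key = normalize(unit.label)
        if key not in merged:
            merged[key] = SemanticUnit(unit.label, set(), set())
        # A merged unit links every entity and chunk seen under that label.
        merged[key].entities |= unit.entities
        merged[key].chunk_ids |= unit.chunk_ids
    return merged


# Illustrative data only: two chunks mention the same unit with different casing.
local_units = [
    SemanticUnit("Apollo 11 landing", {"Neil Armstrong", "Moon"}, {0}),
    SemanticUnit("apollo 11  landing", {"Buzz Aldrin", "Moon"}, {3}),
    SemanticUnit("Gemini program", {"NASA"}, {1}),
]
global_units = merge_globally(local_units)
```

After merging, the unit keyed `"apollo 11 landing"` connects entities from chunks 0 and 3, which is the kind of cross-chunk link a purely local extraction would miss. A real system would likely need a stronger matching criterion than exact label equality.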

## Key findings

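- GOSU outperforms baseline RAG methods in generation quality across multiple tasks.
- Hierarchical keyword extraction uncovers the fine-grained binary relationships that semantic unit completion overlooks.
- Semantic unit completion compensates for the coarse-grained n-ary relationships that hierarchical keyword extraction misses.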
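The complementary roles of the two retrieval steps can be sketched as follows. This is one possible reading, not the paper's algorithm: keywords are assumed to be already extracted (the hierarchy itself is not modeled), binary relationships are stored as `(head, relation, tail)` triples, and semantic units are stored as a mapping from unit name to the set of entities it connects. The function name `retrieve` and all data are hypothetical.

```python
def retrieve(keywords, binary_edges, semantic_units):
    """Return keyword-matched binary edges plus the semantic units that complete them."""
    kw = {k.lower() for k in keywords}

    # Step 1: fine-grained binary relationships whose endpoints match a keyword.
    edges = [e for e in binary_edges if e[0].lower() in kw or e[2].lower() in kw]

    # Step 2 (completion): add every semantic unit that touches an entity
    # seen in step 1, bringing in the n-ary context around those edges.
    seen = {e[0] for e in edges} | {e[2] for e in edges}
    units = sorted(name for name, ents in semantic_units.items() if ents & seen)
    return edges, units


# Illustrative graph only.
binary_edges = [
    ("Neil Armstrong", "commanded", "Apollo 11"),
    ("NASA", "ran", "Gemini program"),
]
semantic_units = {
    "Apollo 11 landing": {"Neil Armstrong", "Buzz Aldrin", "Moon"},
    "Gemini program": {"NASA"},
}

edges, units = retrieve(["neil armstrong"], binary_edges, semantic_units)
```

Here the keyword match alone returns a single binary edge, while the completion step adds the `"Apollo 11 landing"` unit, which also connects Buzz Aldrin and the Moon. Neither step alone returns both the pairwise relation and the multi-entity context.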

## Abstract

Building upon the standard graph-based Retrieval-Augmented Generation (RAG), the introduction of heterogeneous graphs and hypergraphs aims to enrich retrieval and generation by leveraging the relationships between multiple entities through the concept of semantic units (SUs). But this also raises a key issue: The extraction of high-level SUs limited to local text chunks is prone to ambiguity, complex coupling, and increased retrieval overhead due to the lack of global knowledge or the neglect of fine-grained relationships. To address these issues, we propose GOSU, a semantic unit-centric RAG framework that efficiently performs global disambiguation and utilizes SUs to capture interconnections between different nodes across the global context. In the graph construction phase, GOSU performs global merging on the pre-extracted SUs from local text chunks and guides entity and relationship extraction, reducing the difficulty of coreference resolution while uncovering global semantic objects across text chunks. In the retrieval and generation phase, we introduce hierarchical keyword extraction and semantic unit completion. The former uncovers the fine-grained binary relationships overlooked by the latter, while the latter compensates for the coarse-grained n-ary relationships missing from the former. Evaluation across multiple tasks demonstrates that GOSU outperforms the baseline RAG methods in terms of generation quality.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00449/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00449/full.md

## References

87 references — full list in the complete paper: https://tomesphere.com/paper/2509.00449/full.md

---
Source: https://tomesphere.com/paper/2509.00449