GOSU: Retrieval-Augmented Generation with Global-Level Optimized Semantic Unit-Centric Framework

Xuecheng Zou; Ke Liu; Bingbing Wang; Huafei Deng; Li Zhang; Yu Tang

arXiv:2509.00449·cs.CL·September 3, 2025


Open Access

TL;DR

GOSU introduces a global-level semantic unit-centric framework that enhances retrieval-augmented generation by capturing interconnections across global context, improving generation quality over traditional RAG methods.

Contribution

The paper proposes GOSU, a novel framework that performs global disambiguation and captures interconnections between semantic units across text chunks, addressing limitations of local extraction methods.

Findings

01. GOSU outperforms baseline RAG methods in multiple tasks.

02. Hierarchical keyword extraction improves fine-grained relationship uncovering.

03. Semantic unit completion compensates for missing relationships.

Abstract

Building upon the standard graph-based Retrieval-Augmented Generation (RAG), the introduction of heterogeneous graphs and hypergraphs aims to enrich retrieval and generation by leveraging the relationships between multiple entities through the concept of semantic units (SUs). But this also raises a key issue: The extraction of high-level SUs limited to local text chunks is prone to ambiguity, complex coupling, and increased retrieval overhead due to the lack of global knowledge or the neglect of fine-grained relationships. To address these issues, we propose GOSU, a semantic unit-centric RAG framework that efficiently performs global disambiguation and utilizes SUs to capture interconnections between different nodes across the global context. In the graph construction phase, GOSU performs global merging on the pre-extracted SUs from local text chunks and guides entity and relationship…

Tables

Table 1: Win rates (%) of GOSU vs. baselines across five datasets and four evaluation dimensions. Each cell gives baseline / GOSU.

| Baseline | Dimension | Agriculture | CS | Hypertension | Legal | Mix | Avg gap |
|---|---|---|---|---|---|---|---|
| NaiveRAG | Comprehensiveness | 12.0% / 88.0% | 14.3% / 85.7% | 11.1% / 88.9% | 12.0% / 88.0% | 23.1% / 76.9% | +71.0% |
| NaiveRAG | Diversity | 26.8% / 73.2% | 29.7% / 70.3% | 24.4% / 75.6% | 16.7% / 83.3% | 22.9% / 77.1% | +51.8% |
| NaiveRAG | Empowerment | 11.5% / 88.5% | 12.0% / 88.0% | 18.2% / 81.8% | 15.9% / 84.1% | 20.0% / 80.0% | +69.0% |
| NaiveRAG | Overall | 10.0% / 90.0% | 12.5% / 87.5% | 11.2% / 88.5% | 10.0% / 90.0% | 22.7% / 77.3% | +73.4% |
| LightRAG | Comprehensiveness | 8.8% / 91.2% | 20.0% / 80.0% | 25.0% / 75.0% | 29.4% / 70.6% | 41.7% / 58.3% | +50.0% |
| LightRAG | Diversity | 23.6% / 76.4% | 43.5% / 56.5% | 39.0% / 61.0% | 47.9% / 52.1% | 48.8% / 51.2% | +18.9% |
| LightRAG | Empowerment | 6.8% / 93.2% | 23.1% / 76.9% | 29.5% / 70.5% | 35.3% / 64.7% | 29.6% / 70.4% | +50.3% |
| LightRAG | Overall | 5.8% / 94.2% | 20.8% / 79.2% | 22.2% / 77.8% | 34.5% / 65.5% | 31.2% / 68.8% | +54.2% |
| HiRAG | Comprehensiveness | 14.3% / 85.7% | 42.9% / 57.1% | 20.0% / 80.0% | 33.3% / 66.7% | 33.3% / 66.7% | +42.5% |
| HiRAG | Diversity | 36.8% / 63.2% | 46.1% / 53.9% | 47.3% / 52.7% | 48.9% / 51.1% | 48.2% / 51.8% | +9.1% |
| HiRAG | Empowerment | 32.0% / 68.0% | 39.3% / 60.7% | 34.1% / 65.9% | 43.3% / 56.7% | 48.0% / 52.0% | +21.3% |
| HiRAG | Overall | 23.8% / 76.2% | 39.1% / 60.9% | 29.0% / 71.0% | 41.3% / 58.7% | 45.5% / 54.5% | +28.5% |
| HyperGraphRAG | Comprehensiveness | 5.9% / 94.1% | 3.7% / 96.3% | 6.5% / 93.5% | 5.6% / 94.4% | 7.7% / 92.3% | +88.2% |
| HyperGraphRAG | Diversity | 26.2% / 73.8% | 14.6% / 85.4% | 29.8% / 70.2% | 17.9% / 82.1% | 21.5% / 78.5% | +56.0% |
| HyperGraphRAG | Empowerment | 7.9% / 92.1% | 12.0% / 88.0% | 27.5% / 72.5% | 21.4% / 78.6% | 5.0% / 95.0% | +70.5% |
| HyperGraphRAG | Overall | 3.4% / 96.6% | 12.5% / 87.5% | 20.8% / 79.2% | 7.9% / 83.3% | 5.6% / 94.4% | +78.2% |
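
The "Avg gap" column is not defined in this excerpt. The sketch below checks one plausible reading, stated here as an assumption: the gap is the mean, over the five datasets, of the GOSU win rate minus the baseline win rate, in percentage points. The numbers are copied from the Comprehensiveness row of the NaiveRAG comparison; the function name `avg_gap` is only illustrative.

```python
from statistics import mean

# (baseline %, GOSU %) per dataset, copied from the Comprehensiveness row
# of the NaiveRAG comparison: Agriculture, CS, Hypertension, Legal, Mix.
naive_comprehensiveness = [
    (12.0, 88.0), (14.3, 85.7), (11.1, 88.9), (12.0, 88.0), (23.1, 76.9),
]


def avg_gap(pairs):
    """Mean over datasets of (GOSU win rate - baseline win rate), in points.

    Assumed definition: the excerpt does not spell out how the column is computed.
    """
    return mean(gosu - base for base, gosu in pairs)


print(f"{avg_gap(naive_comprehensiveness):+.1f}")  # prints +71.0
```

Under this reading the computed value agrees with the +71.0% reported for that row.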

Table 2: Win rates (%) of GOSU vs. its ablated versions across five datasets and four evaluation dimensions. Each cell gives ablated variant / GOSU.

| Phase | Ablation | Dimension | Agriculture | CS | Hypertension | Legal | Mix | Avg gap |
|---|---|---|---|---|---|---|---|---|
| Construction | w/o GO | Comprehensiveness | 42.9% / 57.1% | 41.7% / 58.3% | 25.0% / 75.0% | 14.3% / 85.7% | 20.0% / 80.0% | +42.4% |
| Construction | w/o GO | Diversity | 46.7% / 53.3% | 48.7% / 51.3% | 48.1% / 51.9% | 40.3% / 59.7% | 49.4% / 50.6% | +6.7% |
| Construction | w/o GO | Empowerment | 46.4% / 53.6% | 42.1% / 57.9% | 42.9% / 57.1% | 22.7% / 77.3% | 40.9% / 59.1% | +22.0% |
| Construction | w/o GO | Overall | 41.7% / 58.3% | 41.2% / 58.8% | 32.4% / 67.6% | 15.8% / 84.2% | 37.5% / 62.5% | +32.6% |
| Retrieval & Generation | w/o EL | Comprehensiveness | 46.7% / 53.3% | 38.5% / 61.5% | 47.1% / 52.9% | 37.5% / 62.5% | 46.2% / 53.8% | +13.6% |
| Retrieval & Generation | w/o EL | Diversity | 45.4% / 54.6% | 41.6% / 58.4% | 48.3% / 51.7% | 45.5% / 54.5% | 43.5% / 56.5% | +10.3% |
| Retrieval & Generation | w/o EL | Empowerment | 44.4% / 55.6% | 36.0% / 64.0% | 44.6% / 55.4% | 47.4% / 52.6% | 43.5% / 56.5% | +13.6% |
| Retrieval & Generation | w/o EL | Overall | 45.2% / 54.8% | 42.9% / 57.1% | 46.9% / 53.1% | 47.1% / 52.9% | 45.0% / 55.0% | +9.2% |
| Retrieval & Generation | w/o RL | Comprehensiveness | 45.5% / 54.5% | 44.4% / 55.6% | 33.3% / 66.7% | 36.4% / 63.6% | 47.1% / 52.9% | +17.3% |
| Retrieval & Generation | w/o RL | Diversity | 44.9% / 55.1% | 46.3% / 53.7% | 40.5% / 59.5% | 45.8% / 54.2% | 39.2% / 60.8% | +13.3% |
| Retrieval & Generation | w/o RL | Empowerment | 45.7% / 54.3% | 40.0% / 60.0% | 37.3% / 62.7% | 48.0% / 52.0% | 41.2% / 58.8% | +15.1% |
| Retrieval & Generation | w/o RL | Overall | 45.8% / 54.2% | 46.7% / 53.3% | 35.1% / 64.9% | 45.0% / 55.0% | 40.9% / 59.1% | +14.6% |
| Retrieval & Generation | w/o EL & RL | Comprehensiveness | 33.3% / 66.7% | 35.7% / 64.3% | 31.2% / 68.8% | 43.7% / 56.3% | 36.8% / 63.2% | +27.7% |
| Retrieval & Generation | w/o EL & RL | Diversity | 32.3% / 67.7% | 38.7% / 61.3% | 31.7% / 68.3% | 46.3% / 53.7% | 34.7% / 65.3% | +26.5% |
| Retrieval & Generation | w/o EL & RL | Empowerment | 30.6% / 69.4% | 37.5% / 62.5% | 43.1% / 56.9% | 48.1% / 51.9% | 37.9% / 62.1% | +21.1% |
| Retrieval & Generation | w/o EL & RL | Overall | 35.7% / 64.3% | 27.8% / 72.2% | 42.4% / 57.6% | 45.8% / 54.2% | 33.3% / 66.7% | +26.0% |
| Retrieval & Generation | w/o SL | Comprehensiveness | 45.5% / 54.5% | 22.2% / 77.8% | 35.3% / 64.7% | 42.9% / 57.1% | 18.2% / 81.8% | +34.4% |
| Retrieval & Generation | w/o SL | Diversity | 38.3% / 61.7% | 43.4% / 56.6% | 44.8% / 55.2% | 42.2% / 57.8% | 39.2% / 60.8% | +16.8% |
| Retrieval & Generation | w/o SL | Empowerment | 38.1% / 61.9% | 32.0% / 68.0% | 38.5% / 61.5% | 34.8% / 65.2% | 36.0% / 64.0% | +28.2% |
| Retrieval & Generation | w/o SL | Overall | 38.9% / 61.1% | 25.0% / 75.0% | 39.4% / 60.6% | 36.8% / 63.2% | 38.1% / 61.9% | +28.7% |

Equations

$$\mathcal{C}=\mathrm{DocSplit}(\mathcal{D}), \tag{1}$$

$$\text{s.t.}\ \forall i\neq j,\ c_{i}\cap c_{j}\neq\emptyset\ \land\ \bigcup_{i=1}^{N}c_{i}=\mathcal{D}. \tag{2}$$

$$\mathcal{S}_{i}^{*}=\{s_{i,1}^{*},s_{i,2}^{*},\dots,s_{i,k_{i}}^{*}\}\sim\mathrm{SemExt}_{\text{LLM}}(c_{i}), \tag{3}$$

$$\mathcal{S}^{*}=\bigcup_{i=1}^{N}\mathcal{S}_{i}^{*}. \tag{4}$$

$$\mathrm{Coarse}(\mathcal{S}^{*})=\bigl\{(s_{i}^{*},s_{j}^{*})\ \big|\ s_{i}^{*},s_{j}^{*}\in\mathcal{S}^{*},\ i\neq j,\ \mathrm{SimJudge}(s_{i}^{*},s_{j}^{*};\tau)=1\bigr\}, \tag{5}$$

$$\mathrm{Fine}(\mathcal{S}^{*})=\bigl\{(s_{i}^{*},s_{j}^{*})\in\mathrm{Coarse}(\mathcal{S}^{*})\ \big|\ \mathrm{LLMJudge}(s_{i}^{*},s_{j}^{*};\tau)=1\bigr\}, \tag{6}$$

$$\hat{\mathcal{S}}=\mathrm{Deduplicate}\circ\mathrm{Cluster}\bigl(\mathrm{Fine}(\mathcal{S}^{*})\bigr). \tag{7}$$

$$\mathrm{Retriever}_{\text{id}}(\hat{s}_{i})=\bigl\{c_{j}\in\mathcal{C}\ \big|\ ID(c_{j})\cap ChunkID(\hat{s}_{i})\neq\emptyset\bigr\}, \tag{8}$$

$$\mathrm{Retriever}_{\text{sim}}(\hat{s}_{i})=\bigl\{c_{j}\in\mathcal{C}\ \big|\ c_{j}\notin\mathrm{Retriever}_{\text{id}}(\hat{s}_{i}),\ c_{j}\in\mathrm{SimRank}_{\mathcal{C}}(\hat{s}_{i})\bigr\}. \tag{9}$$

$$\mathrm{Trim}\bigl(\mathrm{Retriever}_{\text{id}}(\hat{s}_{i})\cup\mathrm{Retriever}_{\text{sim}}(\hat{s}_{i})\bigr)\leq\max\bigl(\tau,\,\mathrm{Len}(\mathrm{Retriever}_{\text{id}}(\hat{s}_{i}))\bigr), \tag{10}$$

$$\mathcal{S}=\mathrm{GloRef}_{\text{LLM}}\Bigl(\hat{\mathcal{S}},\ \mathrm{Retriever}_{\text{id}}(\hat{s}_{i})\cup\mathrm{Retriever}_{\text{sim}}(\hat{s}_{i})\Bigr), \tag{11}$$

$$\mathrm{Pre\text{-}KG}=\bigcup_{c_{i}\in\mathcal{C}}\mathrm{EntRelExt}(c_{i}), \tag{12}$$

$$\mathcal{E}_{i}=\bigcup_{\mathrm{ID}(c_{j})\in\mathrm{ChunkID}(s_{i})}\mathrm{EntExt}_{\text{sem}}(s_{i},c_{j}), \tag{13}$$

$$\mathcal{R}_{i}=\bigcup_{\mathrm{ID}(c_{j})\in\mathrm{ChunkID}(s_{i})}\mathrm{RelExt}_{\text{sem}}(s_{i},c_{j}), \tag{14}$$

$$\mathcal{G}_{i}^{*}=\mathrm{Deduplicate}\circ\mathrm{Link}_{\text{sub}}(s_{i},\mathcal{E}_{i},\mathcal{R}_{i}), \tag{15}$$

$$\mathcal{G}=\mathrm{Deduplicate}\circ\mathrm{Link}_{\text{all}}(\mathcal{G}^{*}), \tag{16}$$

$$(\mathcal{K}_{\text{low}},\mathcal{K}_{\text{sem}},\mathcal{K}_{\text{high}})=\mathrm{KeyExt}_{\text{LLM}}(q;\,\mathcal{G}), \tag{17}$$

$$(\mathcal{G}_{\text{low}},\mathcal{C}_{\text{low}},\mathcal{S}_{\text{low}})=\mathrm{Retriever}_{\text{low}}\bigl(\mathcal{K}_{\text{low}};\,\mathcal{G},\,\mathcal{C},\,\mathcal{S}\bigr), \tag{18}$$

$$(\mathcal{G}_{\text{high}},\mathcal{C}_{\text{high}},\mathcal{S}_{\text{high}})=\mathrm{Retriever}_{\text{high}}\bigl(\mathcal{K}_{\text{high}};\,\mathcal{G},\,\mathcal{C},\,\mathcal{S}\bigr), \tag{19}$$

$$\mathcal{S}_{\text{sem}}=\mathrm{Retriever}_{\text{sem}}\bigl(\mathcal{K}_{\text{sem}};\,\mathcal{S}\bigr), \tag{20}$$

$$\mathcal{S}_{\text{all}}=\mathrm{Trim}\bigl(\mathcal{S}_{\text{low}}\cup\mathcal{S}_{\text{high}}\cup\mathcal{S}_{\text{sem}}\bigr), \tag{21}$$

$$\mathrm{Len}\bigl(\mathcal{S}_{\text{all}}\bigr)\leq\tau, \tag{22}$$

$$(\mathcal{G}_{\text{sem}},\mathcal{C}_{\text{sem}})=\mathrm{Trim}\bigl(\mathrm{Retriever}_{\mathcal{S}_{\text{all}}}(\mathcal{G},\,\mathcal{C})\bigr), \tag{23}$$

$$\mathcal{G}_{\text{all}}=\mathcal{G}_{\text{low}}\cup\mathcal{G}_{\text{high}}\cup\mathcal{G}_{\text{sem}}, \tag{24}$$

$$\mathcal{C}_{\text{all}}=\mathcal{C}_{\text{low}}\cup\mathcal{C}_{\text{high}}\cup\mathcal{C}_{\text{sem}}, \tag{25}$$

$$R=\mathrm{AnsGen}_{\text{LLM}}\bigl(q,\,\mathcal{S}_{\text{all}},\,\mathcal{G}_{\text{all}},\,\mathcal{C}_{\text{all}}\bigr). \tag{26}$$


Taxonomy

Topics: Topic Modeling · Information Retrieval and Search Behavior · Advanced Graph Neural Networks

Full text

GOSU: Retrieval-Augmented Generation with Global-Level Optimized Semantic Unit-Centric Framework

Xuecheng Zou $\spadesuit$ Ke Liu $\diamondsuit$ Bingbing Wang $\heartsuit$

Huafei Deng $\spadesuit$ Li Zhang $\spadesuit$ Yu Tang $\spadesuit$

$\spadesuit$ School of Future Science and Engineering, Soochow University

$\heartsuit$ School of Mathematical Sciences, Soochow University

$\diamondsuit$ School of Information Technology (Smart Campus Education Center),

Suzhou Institute of Trade & Commerce

{xczouxczou, bbwangstat1, 20245258046, 20245258048}@stu.suda.edu.cn

[email protected] [email protected] Corresponding author: [email protected].

Abstract

Building upon the standard graph-based Retrieval-Augmented Generation (RAG), the introduction of heterogeneous graphs and hypergraphs aims to enrich retrieval and generation by leveraging the relationships between multiple entities through the concept of semantic units (SUs). But this also raises a key issue: The extraction of high-level SUs limited to local text chunks is prone to ambiguity, complex coupling, and increased retrieval overhead due to the lack of global knowledge or the neglect of fine-grained relationships. To address these issues, we propose GOSU, a semantic unit-centric RAG framework that efficiently performs global disambiguation and utilizes SUs to capture interconnections between different nodes across the global context. In the graph construction phase, GOSU performs global merging on the pre-extracted SUs from local text chunks and guides entity and relationship extraction, reducing the difficulty of coreference resolution while uncovering global semantic objects across text chunks. In the retrieval and generation phase, we introduce hierarchical keyword extraction and semantic unit completion. The former uncovers the fine-grained binary relationships overlooked by the latter, while the latter compensates for the coarse-grained $n$ -ary relationships missing from the former. Evaluation across multiple tasks demonstrates that GOSU outperforms the baseline RAG methods in terms of generation quality. Our code is available at https://github.com/xczouxczou/GOSU.


1 Introduction

With the explosion of data scale (Ouyang et al., 2022), the performance of large language models (LLMs) is improving by leaps and bounds (OpenAI et al., 2023; Touvron et al., 2023; Mei et al., 2025), yet their finite parameters still lead to frequent hallucinations (Mallen et al., 2022; Min et al., 2023; Ji et al., 2022; Huang et al., 2023). To this end, Retrieval-Augmented Generation (RAG) (Lewis et al., 2020; Gao et al., 2023b; Fan et al., 2024; Hu et al., 2025; Asai et al., 2023), which integrates external knowledge sources to enhance factual consistency and generation accuracy (Sudhi et al., 2024; Es et al., 2024; Salemi and Zamani, 2024; Zhao et al., 2023; Tu et al., 2024; Tonmoy et al., 2024; Shrestha et al., 2024; Liu et al., 2023), has emerged as a promising solution. In standard RAG methods, however, processing fixed-length text chunks often fails to capture direct or indirect relationships between entities, which limits their practicality in knowledge-intensive tasks (Pan et al., 2023; Luo et al., 2023; Wang et al., 2024b; Han et al., 2024; Wen et al., 2023).

Recently, graph-structured RAG methods have enhanced relational representation by incorporating knowledge graphs (Edge et al., 2024; Zhang et al., 2025a; Liang et al., 2025; Guo et al., 2024; Tian et al., 2024; Park et al., 2023; Jiménez Gutiérrez et al., 2024; He et al., 2024; Trajanoska et al., 2023; Sanmartin, 2024; Wang et al., 2024b; Rampášek et al., 2022), but they are constrained by the binary relations inherent in structuring natural language into graphs. This prevents them from effectively modeling $n$-ary relations among multiple entities and limits their performance on complex reasoning tasks (Wen et al., 2016). Current studies are exploring heterogeneous graphs and hypergraphs to tackle this issue (Xu et al., 2025b; Luo et al., 2025a; Huang et al., 2025; Wang et al., 2025a; Ma et al., 2025; Mei et al., 2025). However, as illustrated in Fig. 1, decomposing events within isolated text chunks (Xu et al., 2025b) and over-emphasizing $n$-ary relations (Luo et al., 2025a) not only lead to information fragmentation and contextual discontinuity, but also neglect the precise representation of fine-grained relations and increase the complexity of information coupling. Meeting this challenge therefore requires optimizing the entire RAG pipeline, from knowledge graph construction through retrieval and generation, by integrating global context while balancing coarse-grained and fine-grained relations.

To address these shortcomings, we propose GOSU, a RAG framework that refines semantic unit extraction at the global level and drives the entire pipeline around these semantic units (SUs). GOSU optimizes SUs globally through a multi-round global merging strategy, which prevents the relation fragmentation that can arise from relying on individual text chunks. Specifically, we use the LLM to identify SUs in each text chunk so that critical semantic information is not lost. These identified SUs serve as pre-SUs: they lay the foundation for subsequent disambiguation, deduplication, and merging, and ensure semantic consistency across chunks. Unlike traditional graph methods based on binary relations or hyperedges, GOSU is centered on SUs. It uses SU-centric connections during knowledge graph construction and retrieval to uncover coarse-grained $n$-ary relationships while preserving fine-grained binary relations among low-level entities, thus avoiding excessive information coupling.

Our contributions can be summarized as follows:

• **Global-Level Semantic Unit Optimization:** We propose a global merging strategy for semantic units: an LLM extracts SUs from each text chunk, and the candidates are then globally disambiguated, deduplicated, and merged. This ensures semantic consistency across chunks and avoids the relationship fragmentation caused by local segmentation.

• **Semantic Unit-Centric Knowledge Graph Construction:** Diverging from traditional binary-relation or hyperedge approaches, we center the graph around SUs. This allows us to capture coarse-grained $n$-ary relations while preserving fine-grained binary relations among underlying entities, achieving a balanced representation that mitigates over-coupling of information.

• **Dual-Phase Retrieval-Augmented Generation Framework:** We integrate hierarchical keyword extraction with SU completion in the retrieval and generation stages: keyword extraction targets fine-grained entity and term retrieval, while SU completion fills in coarse-grained multi-entity SUs. Combining the two improves contextual coverage and generation fidelity.

Experiments across multiple open, knowledge-intensive domains show that GOSU achieves superior performance in authenticity, comprehensiveness, diversity, and empowerment (Guo et al., 2024; Qian et al., 2024). These results support the global-level, semantic unit-centric paradigm for graph construction and retrieval-augmented generation, and they suggest its potential for real-world applications.

2 Related work

2.1 Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) grounds large language model (LLM) outputs in content retrieved from external corpora, aligning generation with trusted knowledge and reducing hallucinations (Huang et al., 2023; Niu et al., 2024; Bai et al., 2024; Bang et al., 2025) in knowledge-intensive tasks (Gao et al., 2023a). Subsequent improvements introduced joint training of passage-generation (Izacard et al., 2022) and multidimensional adaptive trade-offs (Min et al., 2021) to refine both query formulation and result filtering. Self-RAG and other variants (Trivedi et al., 2022; Shao et al., 2023; Asai et al., 2023; Yu et al., 2024) leverage the LLM itself within iterative retrieval-reasoning loops to generate follow-up queries and assess retrieved evidence, whereas DRAG (Hu et al., 2025) and FLARE (Jiang et al., 2023; Su et al., 2024) introduce multi-agent debates and token-triggered retrieval to further curb hallucination and reduce unnecessary context. Despite these advances, all of these "flat" RAG approaches rely on coarse chunking and simple retrieval strategies, which tend to fragment context and inject noise when handling nuanced multi-entity events.

2.2 Graph-Structured RAG

To better preserve relationships between entities, graph-structured RAG methods integrate knowledge graphs and graph algorithms into retrieval and generation (Kim et al., 2023; Peng et al., 2024; Xiang et al., 2025; Zhang et al., 2025b). Works such as GraphRAG (Edge et al., 2024; Jiang et al., 2024; Mavromatis and Karypis, 2024; He et al., 2024; Zhang et al., 2025a) propagate contextual signals across retrieved text via multi-round message passing, extract entities, and build binary relation graphs. Subsequently, KAG (Liang et al., 2025) and LightRAG (Guo et al., 2024) introduced confidence-based edge weighting mechanisms to prioritize and reinforce key relationships. GNPLLM and others (Tian et al., 2024; Shen et al., 2024; Barmettler et al., 2025; Xu et al., 2025a; Luo et al., 2025b) further fuse the learned graph embeddings with LLM representations, enriching the input features of downstream generation models. These methods have achieved significant accuracy improvements on fact-alignment tasks that involve pairwise entity relationships. However, because they model only binary edges, they inherently struggle to capture $n$-ary relations that span three or more entities, which loses key information in complex event scenarios involving multi-party interactions.

2.3 Heterogeneous and Hypergraph-Based Approaches

Moving beyond binary graphs, recent work has explored heterogeneous graph structures and hypergraphs to encode $n$ -ary relations. NodeRAG (Xu et al., 2025b) represents events as isolated nodes enriched with detailed type information, allowing the model to distinguish among entities, semantic units, and high-level summaries. HyperGraphRAG (Luo et al., 2025a) further generalizes this approach by introducing hyperedges that directly connect multiple entities within a single relational fact, thereby preserving the integrity of $n$ -ary events. HierarchicalRAG and similar systems (Chen et al., 2024; Huang et al., 2025; Jiao et al., 2025; Wang et al., 2025b; Zou et al., 2025) organize knowledge and enable hierarchical retrieval through multi-level or coarse-to-fine graph structures, while PikeRAG and others (Sun et al., 2019; Asai et al., 2019; Ma et al., 2022; Wang et al., 2025a; Wei et al., 2024; Six et al., 2025) align retrieval and generation more closely around the reasoning chain to highlight prominent multi-entity patterns in the graph. Although these methods substantially increase expressivity and multi-hop reasoning capabilities, they often fragment events across disconnected graph components, inflating the size and density of the index. This in turn raises retrieval overhead and can lead to incoherent generation when the model struggles to traverse highly entangled structures.

2.4 Semantic Unit Extraction and Global Context Modeling

Some prior efforts seek to preserve event coherence via chunk-level or sentence-level segmentation (Xu et al., 2025b; Michelmann et al., 2025; Yu et al., 2023; Brunato et al., 2023; Zhao et al., 2024; Liu et al., 2025; Ni et al., 2025) or overlapping sliding-window retrieval (Lewis et al., 2020; Izacard and Grave, 2020; Karpukhin et al., 2020; Wang et al., 2024a), but these remain fundamentally local strategies that lack corpus-wide consistency. No existing framework simultaneously (1) extracts semantically coherent units at a global level, (2) disambiguates and merges overlapping units across chunks, and (3) constructs a unified graph that balances coarse-grained $n$ -ary relations with fine-grained binary links. Our GOSU framework fills this gap: it first leverages LLMs to identify and globally filter semantic units (SUs) across all text blocks, then builds an SU-centric graph that preserves both detailed entity links and richer multi-entity structures, and finally applies a dual-phase RAG pipeline—fine-grained keyword retrieval followed by SU completion—to achieve enhanced factuality, coherence, and coverage.

3 Methodology

In this section, we present global-level optimized semantic unit-centric RAG (GOSU), as illustrated in Figure 2, which comprises three core components: global semantic unit optimization, semantic unit-centric knowledge graph construction, and semantic unit-centric retrieval and generation. Detailed descriptions of each component follow in Subsections 3.1, 3.2 and 3.3, respectively.

3.1 Global-level Semantic Unit Optimization

To ensure consistency and completeness of semantic units at the global level, GOSU introduces a “Global Semantic Unit Optimization” module at the very beginning of the pipeline, comprising three steps: initial extraction, global filtering and disambiguation, and merging with deduplication.

Initial Extraction

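The external corpus consists of multiple documents. Each document $\mathcal{D}$ is segmented into length-controlled, semantically coherent text chunks $\mathcal{C}=\{c_{1},c_{2},\dots,c_{N}\}$ via a sliding-window algorithm $\mathrm{DocSplit}(\cdot)$, and each chunk is converted into its vector representation:

$$\mathcal{C}=\mathrm{DocSplit}(\mathcal{D}), \tag{1}$$

$$\text{s.t.}\ \forall i\neq j,\ c_{i}\cap c_{j}\neq\emptyset\ \land\ \bigcup_{i=1}^{N}c_{i}=\mathcal{D}. \tag{2}$$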

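For each text chunk $c_{i}$ $(i=1,\ldots,N)$, we employ the selected LLM to extract a set of candidate semantic units:

$$\mathcal{S}_{i}^{*}=\{s_{i,1}^{*},s_{i,2}^{*},\dots,s_{i,k_{i}}^{*}\}\sim\mathrm{SemExt}_{\text{LLM}}(c_{i}), \tag{3}$$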

where $s_{i,j}^{*}$ represents the $j$-th candidate unit in chunk $i$ (an event or concept that satisfies completeness, coherence, and information-bearing capacity), $k_{i}=|\mathcal{S}_{i}^{*}|$ is the (chunk-dependent) number of candidates returned for $c_{i}$, and $\mathrm{SemExt}_{\text{LLM}}(\cdot)$ is the LLM-based extraction procedure. Next, we merge the candidates from all chunks into a global pool:

$$\mathcal{S}^{*}=\bigcup_{i=1}^{N}\mathcal{S}_{i}^{*}. \tag{4}$$

where $N$ is the total number of chunks and $\mathcal{S}^{*}$ is the global candidate set.
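
The initial extraction step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes token-level windows with a fixed `window` and `stride`, and `toy_extract` is a hypothetical stand-in for the LLM extractor $\mathrm{SemExt}_{\text{LLM}}$. Each candidate keeps the ID of the chunk it came from, since the later retrieval steps rely on chunk IDs.

```python
from typing import Callable, Dict, List


def doc_split(tokens: List[str], window: int, stride: int) -> List[List[str]]:
    """Sliding-window split; consecutive chunks overlap when stride < window."""
    if not 0 < stride <= window:
        raise ValueError("need 0 < stride <= window so no token is skipped")
    chunks: List[List[str]] = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reaches the end of the document
        start += stride
    return chunks


def pool_candidates(
    chunks: List[List[str]],
    extract: Callable[[List[str]], List[str]],
) -> List[Dict]:
    """Union of per-chunk candidates, each tagged with its source chunk ID."""
    pool: List[Dict] = []
    for chunk_id, chunk in enumerate(chunks):
        for text in extract(chunk):
            pool.append({"text": text, "chunk_ids": {chunk_id}})
    return pool


def toy_extract(chunk: List[str]) -> List[str]:
    # Hypothetical placeholder: a real system would prompt an LLM here.
    return [" ".join(chunk[:2])]


tokens = "a b c d e f g h i j".split()
chunks = doc_split(tokens, window=4, stride=3)
pool = pool_candidates(chunks, toy_extract)
```

With `window=4` and `stride=3`, neighbouring chunks share one token, and the chunks together cover the whole document.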

Global Filtering

We first conduct a coarse filtering step based on cosine similarity. Given a similarity threshold, we form candidate semantic-unit pairs and then refine them with the LLM to obtain fine-grained filtering:

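$$\mathrm{Coarse}(\mathcal{S}^{*})=\bigl\{(s_{i}^{*},s_{j}^{*})\ \big|\ s_{i}^{*},s_{j}^{*}\in\mathcal{S}^{*},\ i\neq j,\ \mathrm{SimJudge}(s_{i}^{*},s_{j}^{*};\tau)=1\bigr\}, \tag{5}$$

$$\mathrm{Fine}(\mathcal{S}^{*})=\bigl\{(s_{i}^{*},s_{j}^{*})\in\mathrm{Coarse}(\mathcal{S}^{*})\ \big|\ \mathrm{LLMJudge}(s_{i}^{*},s_{j}^{*};\tau)=1\bigr\}, \tag{6}$$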

where $\mathcal{S}^{*}$ is the global candidate set of semantic units, $(s_{i}^{*},s_{j}^{*})$ is an ordered pair of distinct units, $\mathrm{SimJudge}(\cdot,\cdot;\tau)$ is a cosine-similarity-based binary decision function with threshold $\tau$ , and $\mathrm{LLMJudge}(\cdot,\cdot;\tau)$ denotes the LLM-based evaluator applied to pairs surviving the coarse stage.

Finally, we cluster and deduplicate the fine-grained pairs to obtain the refined semantic-unit set:

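$$\hat{\mathcal{S}}=\mathrm{Deduplicate}\circ\mathrm{Cluster}\bigl(\mathrm{Fine}(\mathcal{S}^{*})\bigr). \tag{7}$$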

where $\mathrm{Cluster}(\cdot)$ groups highly similar units into clusters and $\mathrm{Deduplicate}(\cdot)$ removes redundant elements within or across clusters.
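
A rough sketch of the coarse-to-fine filtering and clustering described above, under several assumptions that are not from the paper: pairs are treated as unordered (`i < j`) for brevity, the LLM evaluator is passed in as a plain callable `judge`, clusters are the connected components of the accepted pairs (computed with union-find), and deduplication simply keeps the first unit of each cluster.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def coarse_pairs(vectors: List[Sequence[float]], tau: float) -> List[Tuple[int, int]]:
    """Index pairs whose cosine similarity reaches tau (the SimJudge stage)."""
    n = len(vectors)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if cosine(vectors[i], vectors[j]) >= tau]


def fine_pairs(pairs: List[Tuple[int, int]],
               judge: Callable[[int, int], bool]) -> List[Tuple[int, int]]:
    """Keep only the pairs that the (LLM-based) judge confirms."""
    return [(i, j) for i, j in pairs if judge(i, j)]


def cluster(n: int, pairs: List[Tuple[int, int]]) -> List[List[int]]:
    """Connected components over the confirmed pairs, via union-find."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)
    groups: dict = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def deduplicate(units: List[str], clusters: List[List[int]]) -> List[str]:
    """Keep one representative (the first member) per cluster."""
    return [units[group[0]] for group in clusters]


units = ["A founded B", "A established B", "C acquired D"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # made-up embeddings
pairs = fine_pairs(coarse_pairs(vectors, tau=0.9), judge=lambda i, j: True)
refined = deduplicate(units, cluster(len(units), pairs))
```

The cheap cosine test bounds how many pairs reach the more expensive judge, which is the point of running the coarse stage first.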

Disambiguation and Merging with Deduplication

In addition to retrieving the corresponding text chunks via their identifiers, we augment the retrieval process with vector-similarity search, thereby providing the LLM with sufficient evidence to interpret and integrate the semantic units.

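$$\mathrm{Retriever}_{\text{id}}(\hat{s}_{i})=\bigl\{c_{j}\in\mathcal{C}\ \big|\ ID(c_{j})\cap ChunkID(\hat{s}_{i})\neq\emptyset\bigr\}, \tag{8}$$

$$\mathrm{Retriever}_{\text{sim}}(\hat{s}_{i})=\bigl\{c_{j}\in\mathcal{C}\ \big|\ c_{j}\notin\mathrm{Retriever}_{\text{id}}(\hat{s}_{i}),\ c_{j}\in\mathrm{SimRank}_{\mathcal{C}}(\hat{s}_{i})\bigr\}. \tag{9}$$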

Here $\mathcal{C}$ denotes the full chunk set, $\hat{s}_{i}$ is a refined semantic unit, $ID(c_{j})$ gives the chunk identifier of $c_{j}$ , $ChunkID(\hat{s}_{i})$ is the set of chunk IDs associated with $\hat{s}_{i}$ , and $\mathrm{SimRank}_{\mathcal{C}}(\hat{s}_{i})$ returns the similarity-ranked neighbors of $\hat{s}_{i}$ within $\mathcal{C}$ .

To avoid excessive retrieval, we prioritize ID-based lookup and then supplement it with similarity-based retrieval. The combined set is trimmed to a bounded size:

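$$\mathrm{Trim}\bigl(\mathrm{Retriever}_{\text{id}}(\hat{s}_{i})\cup\mathrm{Retriever}_{\text{sim}}(\hat{s}_{i})\bigr)\leq\max\bigl(\tau,\,\mathrm{Len}(\mathrm{Retriever}_{\text{id}}(\hat{s}_{i}))\bigr), \tag{10}$$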

where $\mathrm{Trim}(\cdot)$ restricts the number of retrieved chunks, $\tau$ is a predefined retrieval threshold, and $\mathrm{Len}(\cdot)$ returns the cardinality of a set.
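
The ID-first retrieval with a similarity supplement can be illustrated with a small sketch. It assumes chunk IDs are integers and that a similarity ranking over chunk IDs is already available (`ranked_by_similarity` is a hypothetical input, best match first). The budget follows the bound described above: the larger of $\tau$ and the number of ID-based hits, so ID-based hits are never dropped.

```python
from typing import List, Set


def retrieve_chunks(su_chunk_ids: Set[int],
                    ranked_by_similarity: List[int],
                    tau: int) -> List[int]:
    """ID-based hits first, then similarity hits, trimmed to the budget."""
    id_hits = sorted(su_chunk_ids)
    # Similarity hits exclude chunks already found by ID lookup.
    sim_hits = [cid for cid in ranked_by_similarity if cid not in su_chunk_ids]
    budget = max(tau, len(id_hits))
    return (id_hits + sim_hits)[:budget]


print(retrieve_chunks({2, 5}, [5, 7, 1, 2, 9], tau=3))  # prints [2, 5, 7]
print(retrieve_chunks({2, 5}, [5, 7, 1, 2, 9], tau=1))  # prints [2, 5]
```

When $\tau$ is smaller than the number of ID-based hits, the similarity supplement contributes nothing, which keeps the retrieved context bounded.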

Finally, the retrieved text is integrated, and the LLM is used to globally refine the semantic units:

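$$\mathcal{S}=\mathrm{GloRef}_{\text{LLM}}\Bigl(\hat{\mathcal{S}},\ \mathrm{Retriever}_{\text{id}}(\hat{s}_{i})\cup\mathrm{Retriever}_{\text{sim}}(\hat{s}_{i})\Bigr), \tag{11}$$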

where $\hat{\mathcal{S}}$ is the set of deduplicated semantic units and $\mathrm{GloRef}_{\text{LLM}}(\cdot)$ denotes the LLM-based global refinement procedure.

3.2 Semantic Unit-Centric Knowledge Graph Construction

After completing the global semantic-unit optimization, GOSU uses the refined set $\mathcal{S}$ to construct the knowledge graph via three stages: entity–relation extraction, subgraph construction, and graph assembly.

Entity–Relation Extraction

Each global semantic unit $s_{i}\in\mathcal{S}$ is mapped to a graph node. For every $s_{i}$ , an LLM extracts fine-grained entities $\mathcal{E}_{i}$ and binary relations $\mathcal{R}_{i}$ , which are used to create entity nodes and relation edges while preserving context indices. Before assembly, we also extract locally identifiable entities and relations from each chunk to form a preliminary subgraph:

$$\mathrm{Pre\text{-}KG}=\bigcup_{c_{i}\in\mathcal{C}}\mathrm{EntRelExt}(c_{i}), \tag{12}$$

where $\mathcal{C}$ is the set of all text chunks and $\mathrm{EntRelExt}(c_{i})$ returns a set of entity–relation assertions extracted from chunk $c_{i}$ .

For each semantic unit, entities and relations are further gathered from its supporting chunks:

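$$\mathcal{E}_{i}=\bigcup_{\mathrm{ID}(c_{j})\in\mathrm{ChunkID}(s_{i})}\mathrm{EntExt}_{\text{sem}}(s_{i},c_{j}), \tag{13}$$

$$\mathcal{R}_{i}=\bigcup_{\mathrm{ID}(c_{j})\in\mathrm{ChunkID}(s_{i})}\mathrm{RelExt}_{\text{sem}}(s_{i},c_{j}), \tag{14}$$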

where $s_{i}$ is a semantic unit, $\mathrm{ID}(c_{j})$ gives the identifier of chunk $c_{j}$ , and $\mathrm{ChunkID}(s_{i})$ is the set of chunk IDs associated with $s_{i}$ ; $\mathrm{EntExt}_{\text{sem}}(s_{i},c_{j})$ and $\mathrm{RelExt}_{\text{sem}}(s_{i},c_{j})$ denote the LLM-based, $s_{i}$ -conditioned entity and relation extractors applied to the context of $c_{j}$ , respectively.

Subgraph Construction

For each global semantic unit $s_{i}$ , we first resolve ambiguity and remove duplicates among its associated entities $\mathcal{E}_{i}$ and relations $\mathcal{R}_{i}$ . We then build an entity–relation subgraph centered at $s_{i}$ , preserving both binary and higher-order ( $n$ -ary) structure:

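$$\mathcal{G}_{i}^{*}=\mathrm{Deduplicate}\circ\mathrm{Link}_{\text{sub}}(s_{i},\mathcal{E}_{i},\mathcal{R}_{i}), \tag{15}$$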

where $\mathrm{Link}_{\text{sub}}(s_{i},\mathcal{E}_{i},\mathcal{R}_{i})$ builds an $s_{i}$ -centric subgraph by linking $s_{i}$ to entities in $\mathcal{E}_{i}$ and instantiating relations in $\mathcal{R}_{i}$ (binary edges; $n$ -ary, if any, via a small relation node). $\mathrm{Deduplicate}(\cdot)$ merges co-referent entities/relations and removes duplicates. $\mathcal{G}_{i}^{*}$ denotes the resulting cleaned subgraph.
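
As a rough illustration of the subgraph step, the sketch below gathers entities and relations for one semantic unit from its supporting chunks and links the unit to each entity. Everything concrete here is an assumption made for the example: the dictionary layout, the `toy_entities` and `toy_relations` stand-ins for the LLM extractors, and `normalize`, which only lower-cases and collapses whitespace and is far weaker than the LLM-based merging of co-referent entities described above.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def normalize(name: str) -> str:
    """Crude stand-in for deduplication: case and whitespace folding only."""
    return " ".join(name.lower().split())


def build_su_subgraph(
    su: str,
    chunk_ids: Iterable[int],
    chunks: Dict[int, str],
    ent_ext: Callable[[str, str], List[str]],
    rel_ext: Callable[[str, str], List[Triple]],
) -> Dict:
    entities: Set[str] = set()
    relations: Set[Triple] = set()
    for cid in sorted(chunk_ids):  # union over the unit's supporting chunks
        entities |= {normalize(e) for e in ent_ext(su, chunks[cid])}
        relations |= {(normalize(h), r, normalize(t))
                      for h, r, t in rel_ext(su, chunks[cid])}
    su_links = {(su, e) for e in entities}  # the unit is the hub of its subgraph
    return {"su": su, "entities": entities,
            "relations": relations, "su_links": su_links}


# Hypothetical extractors; a real system would call an LLM conditioned on `su`.
KNOWN = {"aspirin", "stroke", "bleeding"}


def toy_entities(su: str, chunk: str) -> List[str]:
    return [w.strip(".") for w in chunk.split() if w.strip(".").lower() in KNOWN]


def toy_relations(su: str, chunk: str) -> List[Triple]:
    if "reduces" in chunk:
        return [("Aspirin", "reduces", "stroke")]
    return [("aspirin", "may cause", "bleeding")]


chunks = {0: "Aspirin reduces stroke risk.", 1: "aspirin  may cause bleeding."}
sub = build_su_subgraph("aspirin trade-offs", {0, 1}, chunks,
                        toy_entities, toy_relations)
```

Because sets are used throughout, "Aspirin" and "aspirin" from different chunks collapse into a single entity node after normalization.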

Knowledge Graph Assembly

Once all semantic-unit–centric subgraphs $\mathcal{G}_{i}^{*}$ are constructed, we assemble them into the final knowledge graph $\mathcal{G}$ by resolving cross-subgraph ambiguities and removing duplicates.

$$\mathcal{G}=\mathrm{Deduplicate}\circ\mathrm{Link}_{\text{all}}(\mathcal{G}^{*}), \tag{16}$$

where $\mathcal{G}^{*}=\{\mathcal{G}_{i}^{*}\}$ denotes the set of all subgraphs; $\mathrm{Link}_{\text{all}}(\mathcal{G}^{*})$ establishes cross-subgraph links by aligning co-referent entities/relations and adding inter-unit edges based on shared identifiers and context; $\mathrm{Deduplicate}(\cdot)$ then collapses remaining duplicates and resolves conflicts to produce $\mathcal{G}$ .
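
One simple way to picture the cross-subgraph linking is to connect two semantic units whenever they share an entity. The sketch below does only that, under the assumption that entities have already been normalized to identical strings; the function name `link_units` and the `(unit, unit, shared_entity)` edge format are illustrative, and the paper's linking also uses context, which is not modeled here.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def link_units(su_entities: Dict[str, Set[str]]) -> Set[Tuple[str, str, str]]:
    """Inter-unit edges (su_a, su_b, shared_entity) for units sharing an entity."""
    entity_to_sus: Dict[str, Set[str]] = {}
    for su, entities in su_entities.items():
        for entity in entities:
            entity_to_sus.setdefault(entity, set()).add(su)
    links: Set[Tuple[str, str, str]] = set()
    for entity, sus in entity_to_sus.items():
        for a, b in combinations(sorted(sus), 2):
            links.add((a, b, entity))
    return links


links = link_units({
    "su1": {"aspirin", "stroke"},
    "su2": {"aspirin"},
    "su3": {"diet"},
})
print(links)  # prints {('su1', 'su2', 'aspirin')}
```

Indexing by entity first avoids comparing every pair of units directly, which matters once the number of units grows.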

3.3 Semantic Unit-Centric Retrieval and Generation

Once the knowledge graph is built, GOSU adopts two complementary retrieval–generation pathways to jointly capture fine-grained binary relations and global $n$ -ary events.

Hierarchical Keyword Extraction

Building on LightRAG (Guo et al., 2024), we perform hierarchical keyword extraction from the user query $q$ to support low-cost, effective retrieval of fine-grained relations. Beyond prior work that uses only low-level entity keywords $\mathcal{K}_{low}$ and high-level thematic keywords $\mathcal{K}_{high}$, we introduce a mid-level "semantic-unit" tier $\mathcal{K}_{sem}$, whose compact phrases encapsulate self-contained facts, relations, or events and thus improve retrieval precision with negligible overhead:

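$$(\mathcal{K}_{\text{low}},\mathcal{K}_{\text{sem}},\mathcal{K}_{\text{high}})=\mathrm{KeyExt}_{\text{LLM}}(q;\,\mathcal{G}), \tag{17}$$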

where $q$ is the input query, $\mathcal{G}$ is the constructed knowledge graph (used for optional conditioning and normalization), $\mathcal{K}_{low}$ collects entity-/attribute-level terms (e.g., names, IDs, types), $\mathcal{K}_{sem}$ collects short semantic-unit phrases that summarize atomic facts or events, and $\mathcal{K}_{high}$ collects theme-/topic-level terms. Each $\mathcal{K}_{\bullet}$ is a (ranked) set of keywords yielded by the LLM extractor $\mathrm{KeyExt}_{\text{LLM}}(\cdot)$ .
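
To make the three keyword tiers concrete, the sketch below parses a hypothetical JSON reply from the keyword extractor into ranked, duplicate-free lists. The JSON field names (`low_level`, `semantic_unit`, `high_level`) and the `Keywords` container are assumptions made for this example; the excerpt does not specify the extractor's output format.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Keywords:
    low: List[str]   # entity- and attribute-level terms
    sem: List[str]   # short semantic-unit phrases
    high: List[str]  # theme- and topic-level terms


def parse_keywords(raw: str) -> Keywords:
    """Parse an (assumed) JSON reply, keeping order and dropping duplicates."""
    data = json.loads(raw)

    def tier(name: str) -> List[str]:
        seen: List[str] = []
        for keyword in data.get(name, []):
            keyword = keyword.strip()
            if keyword and keyword not in seen:
                seen.append(keyword)
        return seen

    return Keywords(tier("low_level"), tier("semantic_unit"), tier("high_level"))


raw_reply = json.dumps({
    "low_level": ["ACE inhibitor", "ACE inhibitor ", "blood pressure"],
    "semantic_unit": ["ACE inhibitors lower blood pressure"],
    "high_level": ["hypertension treatment"],
})
keywords = parse_keywords(raw_reply)
```

Keeping the tiers separate lets each one drive its own retriever later in the pipeline.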

Semantic-Unit Completion

We first use low- and high-level keywords to locate target entities and relations, then enrich them with weakly related but semantically relevant nodes, edges, and chunks. In parallel, we extract directly involved semantic units to cover coarse, multi-entity events that basic keyword matching may miss:

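$$(\mathcal{G}_{\text{low}},\mathcal{C}_{\text{low}},\mathcal{S}_{\text{low}})=\mathrm{Retriever}_{\text{low}}\bigl(\mathcal{K}_{\text{low}};\,\mathcal{G},\,\mathcal{C},\,\mathcal{S}\bigr), \tag{18}$$

$$(\mathcal{G}_{\text{high}},\mathcal{C}_{\text{high}},\mathcal{S}_{\text{high}})=\mathrm{Retriever}_{\text{high}}\bigl(\mathcal{K}_{\text{high}};\,\mathcal{G},\,\mathcal{C},\,\mathcal{S}\bigr), \tag{19}$$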

where $\mathcal{K}_{\text{low}}$ / $\mathcal{K}_{\text{high}}$ are the low-/high-level keyword sets, $\mathcal{G}$ is the knowledge graph, $\mathcal{C}$ the chunk set, and $\mathcal{S}$ the semantic-unit set. $\mathrm{Retriever}_{\text{low}}$ returns a keyword-matched, finely scoped subgraph $\mathcal{G}_{\text{low}}$ (with associated chunks $\mathcal{C}_{\text{low}}$ and units $\mathcal{S}_{\text{low}}$ ), while $\mathrm{Retriever}_{\text{high}}$ returns a theme-oriented subgraph $\mathcal{G}_{\text{high}}$ (with $\mathcal{C}_{\text{high}}$ , $\mathcal{S}_{\text{high}}$ ), optionally expanded by lightweight graph heuristics (e.g., short-hop neighbors or similarity-ranked additions) to include weak but informative context.

When the candidates are insufficient, we augment them via similarity matching with semantic-level keywords:

[TABLE]

where $\mathcal{K}_{\text{sem}}$ is the semantic-level keyword set, $\mathrm{Retriever}_{\text{sem}}$ returns similarity-matched semantic units from $\mathcal{S}$, $\mathrm{Trim}(\cdot)$ limits the set size, $\mathrm{Len}(\cdot)$ returns the set cardinality, and $\tau$ is a size threshold.
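One plausible reading of this completion step is sketched below: the units found by the low- and high-level retrievers are merged, and similarity-matched units are added only when fewer than $\tau$ candidates are available. The function name, the `retrieve_sem` callable, and the choice to trim to a `budget` that defaults to $\tau$ are assumptions made for illustration, not the authors' implementation.

```python
def complete_semantic_units(s_low, s_high, retrieve_sem, tau, budget=None):
    """Merge directly retrieved units and top up from similarity search if needed.

    s_low, s_high : unit ids from the low-/high-level retrievers
    retrieve_sem  : zero-argument callable returning similarity-ranked unit ids
                    (stands in for Retriever_sem; hypothetical interface)
    tau           : size threshold below which candidates count as insufficient
    budget        : maximum size after trimming (assumed to default to tau)
    """
    budget = tau if budget is None else budget
    merged = list(dict.fromkeys([*s_low, *s_high]))   # ordered union
    if len(merged) < tau:                             # Len(...) < tau
        merged = list(dict.fromkeys([*merged, *retrieve_sem()]))
    return merged[:budget]                            # Trim(...)
```

Passing the similarity search as a callable means the extra lookup runs only when the threshold test fails, which matches the "when the candidates are insufficient" condition.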

Next, to further enrich both fine-grained binary relations and coarse-grained $n$-ary events, we traverse each semantic unit’s associated entities and relations:

[TABLE]

where $\mathcal{S}_{\text{all}}$ is the aggregated semantic-unit set (from Eq. (21)); $\mathrm{Retriever}_{\mathcal{S}_{\text{all}}}(\cdot)$ collects a subgraph $\mathcal{G}_{\text{sem}}$ and chunk set $\mathcal{C}_{\text{sem}}$ by following entity/relation links and context indices associated with units in $\mathcal{S}_{\text{all}}$; $\mathrm{Trim}(\cdot)$ limits the size of the returned graph/chunk sets to a preset budget; $\mathcal{G}_{\text{low}}, \mathcal{G}_{\text{high}}, \mathcal{C}_{\text{low}}, \mathcal{C}_{\text{high}}$ are from Eqs. (18)–(19).
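The traversal can be pictured with the following minimal sketch, which follows each unit's links to entities, relations, and source chunks and then trims the results. The `SemanticUnit` record, its field names, and the use of one shared `budget` for all three result lists are assumptions; the paper only states that links and context indices are followed and that results are trimmed to a preset budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticUnit:
    """Hypothetical record linking a unit to graph elements and source chunks."""
    text: str
    entities: tuple = ()      # entity names
    relations: tuple = ()     # (head, relation, tail) triples
    chunk_ids: tuple = ()     # ids of the chunks the unit was extracted from


def ordered_union(*groups):
    """Concatenate iterables, dropping repeats while keeping first-seen order."""
    return list(dict.fromkeys(item for group in groups for item in group))


def expand_units(units, budget):
    """Follow each unit's links, then trim every result list to `budget`."""
    entities = ordered_union(*(u.entities for u in units))[:budget]
    relations = ordered_union(*(u.relations for u in units))[:budget]
    chunk_ids = ordered_union(*(u.chunk_ids for u in units))[:budget]
    return entities, relations, chunk_ids
```

The same `ordered_union` helper can then combine the unit-derived chunks with those from the low- and high-level retrievers, for example `ordered_union(c_low, c_high, c_sem)`.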

Fusion for Generation

Finally, we fuse retrieved snippets, semantic units, and graph context to guide the generator, producing an answer $R$ that cites fine-grained facts while maintaining global coherence across multi-entity events.

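$$R = \mathrm{AnsGen}_{\text{LLM}}(q, \mathcal{S}_{\text{all}}, \mathcal{G}_{\text{all}}, \mathcal{C}_{\text{all}})$$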

where $\mathcal{S}_{\text{all}}$ is the aggregated semantic-unit set (Eq. (21)), $\mathcal{G}_{\text{all}}$ the aggregated subgraph (Eq. (24)), $\mathcal{C}_{\text{all}}$ the aggregated chunk set (Eq. (25)); $\mathrm{AnsGen}_{\text{LLM}}(\cdot)$ denotes the LLM-based response generator conditioned on these inputs.
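To make the fusion step concrete, the sketch below assembles the three context types into a single prompt for the generator. The section titles, the triple rendering, and the `build_prompt` name are hypothetical; the paper does not give its prompt template.

```python
def build_prompt(query, units, edges, chunks):
    """Assemble one prompt from semantic units, graph relations, and chunks.

    units  : list of semantic-unit strings
    edges  : list of (head, relation, tail) triples
    chunks : list of source-text snippets
    """
    sections = [
        ("Semantic units", units),
        ("Graph relations", [f"{h} -[{r}]-> {t}" for h, r, t in edges]),
        ("Source chunks", chunks),
    ]
    lines = []
    for title, items in sections:
        if items:                          # skip context types that are empty
            lines.append(f"## {title}")
            lines.extend(f"- {item}" for item in items)
    lines.append(f"## Question\n{query}")
    return "\n".join(lines)
```

Placing the question last keeps it adjacent to the point where generation begins, a common prompt layout rather than something the paper prescribes.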

4 Experiments

To comprehensively assess the effectiveness of GOSU on knowledge-intensive generation tasks, we conducted extensive experiments on several publicly available domain datasets, compared GOSU against a range of representative baselines, and performed systematic ablation studies.

4.1 Experimental Setup

Datasets.

To evaluate GOSU’s cross-vertical performance and follow established experimental protocols (Guo et al., 2024; Luo et al., 2025a), we selected four domain datasets from the UltraDomain benchmark (Qian et al., 2024): Agriculture, Computer Science (CS), Law, and Mix, together with a fifth dataset consisting of the most recent international hypertension guidelines (McCarthy et al., 2025). Additionally, following the generation methodology of Edge et al. (2024), we employed an LLM to synthesize distinct RAG user profiles for each vertical and, from each profile’s perspective, generated multiple corpus-level queries that require holistic comprehension of the entire collection.

Baselines.

We compared GOSU against four state-of-the-art public RAG systems: NaiveRAG (Gao et al., 2023b), the standard baseline that retrieves fixed-length text chunks by similarity; LightRAG (Guo et al., 2024), a lightweight model that employs a two-tier retrieval strategy to balance recall and efficiency; HiRAG (Huang et al., 2025), a framework that leverages hierarchical knowledge representations to enhance semantic understanding and capture structural relations; and HyperGraphRAG (Luo et al., 2025a), a novel RAG approach that incorporates hypergraph structures to capture higher-order, multi-entity relations.

Evaluation Metrics.

To more thoroughly evaluate outcomes, particularly for queries that invoke complex, high-level semantics, we follow the evaluation protocol of KnowTuning and related work (Lyu et al., 2024; Guo et al., 2024; Edge et al., 2024) and adopt four assessment dimensions: Comprehensiveness, Diversity, Empowerment, and Overall. To mitigate potential positional bias (Zheng et al., 2023; Pezeshkpour and Hruschka, 2023), we employed an alternating pairwise comparison protocol in which candidate answers were presented in randomized left–right order and, for each evaluation dimension, the preferred answer was selected from the resulting pairwise judgments. We accept a comparison outcome only when the alternating judgments produce a consistent preference; otherwise, we treat the observed quality difference as smaller than the positional bias, deem the result inconclusive, and exclude it from further analysis. The final overall preference was determined by aggregating the rankings across the three primary dimensions (Comprehensiveness, Diversity, and Empowerment), with ties resolved using the Overall score.
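The consistency filter described above can be sketched as follows. The `judge` callable, which returns `"left"` or `"right"`, stands in for the LLM evaluator and is hypothetical; the sketch also assumes that "alternating" means each pair is judged once in each presentation order.

```python
def consistent_preference(judge, answer_a, answer_b):
    """Judge a pair in both presentation orders.

    Returns "A" or "B" when both orders agree on the same answer,
    or None when the verdict flips with position (inconclusive).
    """
    first = judge(answer_a, answer_b)     # A shown on the left
    second = judge(answer_b, answer_a)    # A shown on the right
    if first == "left" and second == "right":
        return "A"
    if first == "right" and second == "left":
        return "B"
    return None


def win_rate(outcomes):
    """Percentage of conclusive comparisons won by A; None if none are conclusive."""
    kept = [o for o in outcomes if o is not None]
    if not kept:
        return None
    return 100.0 * kept.count("A") / len(kept)
```

A judge that always prefers the left-hand answer yields `None` for every pair, so purely positional verdicts never contribute to the win rate.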

Implementation Details.

We used GPT-4o-mini as the generative model and BGE-m3 for vector embeddings. To ensure experimental consistency and fair comparison, chunk size and all other retrieval- and generation-related hyperparameters were held identical across all methods.

4.2 Experimental Results

We compared GOSU against the baseline methods across each domain along multiple evaluation dimensions; the results are summarized in Table 1 and Table 2.

General Comparison.

As shown in Table 1, GOSU demonstrated stable performance across domains, consistently achieving higher win rates than all baselines on the four evaluation dimensions—Comprehensiveness, Diversity, Empowerment, and Overall—indicating its superior ability to produce more complete, varied, and practically useful responses.

Compared to NaiveRAG, GOSU achieved an average win-rate margin exceeding 50% across all evaluation dimensions, highlighting the superiority of graph-based RAG approaches over chunk-based retrieval in capturing complex semantic dependencies for knowledge-intensive tasks. Although GOSU is also a graph-augmented RAG method, it consistently outperforms LightRAG and HyperGraphRAG. This result indicates that, compared with approaches that rely solely on pairwise edges or purely $n$-ary hyperedges, GOSU more effectively integrates fine-grained and coarse-grained semantic units and leverages them during retrieval and generation to produce higher-quality responses. Among the baselines, HiRAG exhibits the smallest performance gap relative to GOSU. This finding indicates that hierarchical knowledge representations do enhance semantic understanding and structural capture, and it further validates that GOSU’s strategy of driving the pipeline with globally completed semantic units can achieve comparable or even superior results by explicitly integrating corpus-level semantic coherence with a dual-phase retrieval-and-generation process.

Experiments across multiple datasets highlight GOSU’s superior capability to integrate semantic information and to recognize structural variations across tasks and domains, with particularly strong gains in knowledge-intensive scenarios.

Ablation Study.

To rigorously assess the contribution of each component within the GOSU framework, we conducted extensive ablation studies at both the knowledge construction stage and the retrieval and generation stage (see Table 2).

Knowledge Construction Stage. We ablated the global-level semantic unit optimization (w/o GO). This modification produced pronounced declines in Comprehensiveness, Empowerment, and Overall scores, with the performance degradation particularly marked on the medical and legal benchmarks. These results indicate that the GO module is crucial for globally extracting and completing $n$-ary relations and for facilitating the identification of relevant binary relations, thereby improving evidence aggregation and downstream generation quality.

Retrieval and Generation Stage. We ablated each component of the three-stage retrieval mechanism by removing the entity layer (w/o EL), the relation layer (w/o RL), and the semantic-unit layer (w/o SL). All ablations produced measurable performance degradations, with the removal of the semantic-unit layer yielding the largest decline. This result validates the critical role of semantic units in completing coarse-grained information and supporting robust multi-entity evidence aggregation. Additionally, we ablated both the entity and relation layers (w/o EL & RL). The combined removal produced a marked performance degradation, further confirming that fine-grained knowledge, encoded by entity- and relation-level signals, is also indispensable for producing high-quality, factually grounded generations.

These experimental results demonstrate that each module deployed across GOSU’s pipeline is necessary to achieve optimal generation quality.

Analysis of Efficiency and Cost.

We conducted a comprehensive cost comparison of GOSU and four baseline methods across the knowledge construction (offline) and retrieval and generation (online) phases. The experimental results are presented in Fig. 3. We measured four cost metrics: token consumption for vector embeddings per text chunk (TPC), prompt-completion cost per text chunk (CPC), token consumption for vector embeddings per query (TPQ), and prompt-completion cost per query (CPQ).

During the knowledge construction phase, the integration of the global semantic unit optimization module increased GOSU’s embedding token consumption: TPC reached 29,560 tokens, substantially higher than HyperGraphRAG’s 4,940 tokens. In terms of CPC, GOSU incurred \$0.00688 per text chunk, marginally higher than the three other baselines but still lower than HiRAG, which incurred \$0.00806. During the retrieval and generation phase, GOSU recorded a TPQ of 50 tokens, higher than LightRAG (30 tokens) but lower than HyperGraphRAG (70 tokens). For CPQ, GOSU incurred \$0.00720 per query, comparable to LightRAG (\$0.00600) and lower than HiRAG (\$0.00880). These results indicate that, although GOSU incurs additional cost to better accommodate knowledge-intensive tasks, the overhead remains within acceptable bounds. By employing a coarse-to-fine, two-tier filtering strategy to globally complete semantic units, GOSU balances structural and semantic representations, uncovers both fine-grained binary relations and coarse-grained $n$-ary relations, and thereby effectively supports a three-stage retrieval-and-generation pipeline. Moreover, because the usage costs of many high-performance models, including those employed in this study, have been steadily decreasing and in some cases have become free, the additional token overhead is acceptable.
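For a quick sense of scale, the snippet below turns the figures quoted in this paragraph into relative ratios. It uses only the numbers reported above; the dictionary layout and the `ratio_to` helper are illustrative.

```python
# Figures quoted in the cost analysis above (units follow the metric definitions).
tpc = {"GOSU": 29_560, "HyperGraphRAG": 4_940}                    # embedding tokens, offline
cpc = {"GOSU": 0.00688, "HiRAG": 0.00806}                         # USD, offline
tpq = {"GOSU": 50, "LightRAG": 30, "HyperGraphRAG": 70}           # embedding tokens, online
cpq = {"GOSU": 0.00720, "LightRAG": 0.00600, "HiRAG": 0.00880}    # USD, online


def ratio_to(table, baseline, system="GOSU"):
    """Return the system's cost divided by the baseline's cost for one metric."""
    return table[system] / table[baseline]


for name, table in [("TPC", tpc), ("CPC", cpc), ("TPQ", tpq), ("CPQ", cpq)]:
    for baseline in table:
        if baseline != "GOSU":
            print(f"{name}: GOSU / {baseline} = {ratio_to(table, baseline):.2f}")
```

On these numbers, GOSU's offline embedding token consumption is roughly six times HyperGraphRAG's, while its per-query completion cost is 1.2 times LightRAG's and about 0.82 times HiRAG's.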

Regarding efficiency, because GOSU performs pairwise similarity comparisons between semantic units during knowledge-graph construction, it incurs substantial computational overhead and leads to increased preprocessing time—especially on large corpora. This pairwise matching step is computationally intensive compared with simpler indexing strategies and represents the primary source of GOSU’s higher offline latency. Nonetheless, knowledge-graph construction is typically a one-time operation in most deployment scenarios, and subsequent system activity concentrates on retrieval and query handling; therefore, the upfront cost does not materially degrade ongoing online efficiency.
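The source of this overhead can be seen in a minimal sketch of exhaustive pairwise matching: $n$ units require $n(n-1)/2$ similarity evaluations, so the work grows quadratically with corpus size. Cosine similarity over embedding vectors, the `threshold` parameter, and the function names are assumptions; the paper does not detail the similarity measure at this point.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_pairs(embeddings, threshold):
    """Compare every pair of unit embeddings once.

    Returns the index pairs whose similarity reaches `threshold`
    together with the number of comparisons performed.
    """
    pairs, comparisons = [], 0
    for (i, u), (j, v) in combinations(enumerate(embeddings), 2):
        comparisons += 1
        if cosine(u, v) >= threshold:
            pairs.append((i, j))
    return pairs, comparisons
```

Because the comparison count depends only on the number of units, this step is paid once per corpus build and does not recur at query time.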

5 Conclusion

GOSU is a new framework that centers the RAG pipeline on semantic units that are optimized at the corpus level. By using these globally consistent units to guide the extraction of binary and $n$-ary relations, GOSU achieves a balanced fusion of fine-grained entity links and coarse-grained multi-entity structures, resulting in more faithful, comprehensive, and coherent retrieval-augmented generation. Unlike approaches limited to individual text chunks, GOSU introduces a two-stage coarse-to-fine filtering mechanism to better summarize semantic units and extract structural information at the global corpus level. Driven by these semantic units, the SU-centric design and three-stage retrieval pipeline supplement low-level signals with tightly related high-level perspectives, yielding more coherent and semantically complete retrieval and generation. Extensive experiments demonstrate that GOSU consistently outperforms existing RAG pipelines across diverse vertical domains and related tasks. While GOSU sacrifices some efficiency and incurs higher preprocessing and token costs to deliver superior retrieval and generation quality, our measurements show that these overheads are moderate and justified by the overall gains. Taken together, GOSU emphasizes the equal and complementary roles of fine- and coarse-grained relational modeling within graph-based RAG frameworks, delivering a scalable and high-quality approach for real-world AIGC tasks that require faithful, comprehensive, and coherent knowledge integration.

Limitations

Cross-domain experiments demonstrate that GOSU achieves substantial improvements in retrieval-augmented generation, but there remains room for further refinement.

• First, the current method does not incorporate multimodal inputs and therefore cannot fully exploit knowledge embedded in images, tables, and other non-textual artifacts within multimodal corpora, which may lead to omission of important information.

• Second, GOSU primarily focuses on $n$-ary and binary relations and may lack the capacity to uncover deep chains of reasoning required for more complex inferential tasks.

• Additionally, although GOSU enriches knowledge structure by centering on semantic units and employing a three-layer retrieval pipeline, there is still potential to further improve retrieval and generation efficiency.

Future work will investigate methods to overcome the limitations identified above.

Ethics Statement

This paper investigates RAG via GOSU, a semantic-unit–centric framework that globally optimizes semantic units to drive extraction of binary and $n$-ary relations. We employ large language models for semantic-unit extraction and SU-centric graph construction, together with retrieval-augmented generation techniques to improve knowledge representation and generation quality. All data used in this study are publicly available and contain no personally identifiable or sensitive information; therefore, we believe the work adheres to ethical principles.

Bibliography

1. Asai et al. (2019) Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, and Caiming Xiong. 2019. Learning to retrieve reasoning paths over Wikipedia graph for question answering. Preprint, arXiv:1911.10470.
2. Asai et al. (2023) Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi. 2023. Self-RAG: Learning to retrieve, generate, and critique through Self-Reflection. Preprint, arXiv:2310.11511.
3. Bai et al. (2024) Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, and Mike Zheng Shou. 2024. Hallucination of Multimodal Large Language Models: A survey. Preprint, arXiv:2404.18930.
4. Bang et al. (2025) Yejin Bang, Ziwei Ji, Alan Schelten, Anthony Hartshorn, Tara Fowler, Cheng Zhang, Nicola Cancedda, and Pascale Fung. 2025. HalluLens: LLM hallucination benchmark. Preprint, arXiv:2504.17550.
5. Barmettler et al. (2025) Joel Barmettler, Abraham Bernstein, and Luca Rossetto. 2025. ConceptFormer: Towards efficient use of Knowledge-Graph embeddings in Large Language Models. Preprint, arXiv:2504.07624.
6. Brunato et al. (2023) Dominique Brunato, Felice Dell’Orletta, Irene Dini, and Andrea Amelio Ravelli. 2023. Coherent or not? stressing a neural language model for discourse coherence in multiple languages. In Findings of the Association for Computational Linguistics: ACL 2023, pages 10690–10700, Toronto, Canada. Association for Computational Linguistics.
7. Chen et al. (2024) Weijie Chen, Ting Bai, Jinbo Su, Jian Luan, Wei Liu, and Chuan Shi. 2024. KG-Retriever: Efficient knowledge indexing for Retrieval-Augmented Large Language Models. Preprint, arXiv:2412.05547.
8. Edge et al. (2024) Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, and Jonathan Larson. 2024. From local to global: A graph RAG approach to Query-Focused Summarization. Preprint, arXiv:2404.16130.