Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs

Haozhen Zhang; Tao Feng; Jiaxuan You

arXiv:2410.11001·cs.CL·May 30, 2025

TL;DR

This paper introduces Graph of Records (GoR), a novel method that enhances retrieval-augmented generation for long-context summarization by leveraging historical LLM responses through graph neural networks, leading to improved summarization performance.

Contribution

The paper proposes GoR, which constructs a graph from retrieved texts and LLM responses, using GNNs and BERTScore-based training to better utilize historical responses for long-context summarization.
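The graph-construction step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bag-of-words retriever, the `stub_llm` placeholder, and all function names are assumptions made for this example, and the GNN and BERTScore-based training stages are not shown. It only illustrates the idea of linking each LLM response node to the text chunks that were retrieved to produce it.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k):
    """Toy retriever (assumption): indices of the k chunks most similar to the query."""
    q = Counter(tokenize(query))
    scores = [(cosine(q, Counter(tokenize(c))), i) for i, c in enumerate(chunks)]
    scores.sort(key=lambda s: (-s[0], s[1]))  # highest score first, ties by index
    return [i for _, i in scores[:k]]


def stub_llm(query, context):
    """Hypothetical stand-in for an LLM call; returns a fixed-format string."""
    return f"Answer to '{query}' based on {len(context)} chunk(s)."


def build_graph(chunks, queries, k=2):
    """Create chunk nodes, then one response node per query linked to its retrieved chunks."""
    nodes = [("chunk", c) for c in chunks]
    edges = []
    for query in queries:
        hits = retrieve(query, chunks, k)
        response = stub_llm(query, [chunks[i] for i in hits])
        nodes.append(("response", response))
        response_id = len(nodes) - 1
        edges.extend((chunk_id, response_id) for chunk_id in hits)
    return nodes, edges


chunks = [
    "The river floods every spring.",
    "Farmers plant rice after the flood.",
    "The city council met on Monday.",
]
queries = ["When does the river flood?", "What did the council do?"]
nodes, edges = build_graph(chunks, queries)
print(edges)
```

In a fuller pipeline, the node texts would presumably be embedded and passed to a GNN, as the summary describes; that part is left out here because the page does not give enough detail to sketch it faithfully.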

Findings

01

GoR outperforms 12 baselines on four datasets.

02

Achieves up to 19% improvement in ROUGE scores.

03

Effectively leverages historical responses for better summaries.
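For readers unfamiliar with the ROUGE metric cited in the findings, the sketch below computes a simplified ROUGE-1 F1 score from clipped unigram overlap. It is an illustrative assumption, not the evaluation code used in the paper: standard ROUGE toolkits typically add stemming and other preprocessing, and the function name `rouge1_f1` is hypothetical.

```python
import re
from collections import Counter


def rouge1_f1(candidate, reference):
    """Simplified ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 4))  # 5 of 6 unigrams match on both sides, so F1 = 5/6
```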

Abstract

Retrieval-augmented generation (RAG) has revitalized Large Language Models (LLMs) by injecting non-parametric factual knowledge. Compared with long-context LLMs, RAG is considered an effective summarization tool in a more concise and lightweight manner, which can interact with LLMs multiple times using diverse queries to get comprehensive responses. However, the LLM-generated historical responses, which contain potentially insightful information, are largely neglected and discarded by existing approaches, leading to suboptimal results. In this paper, we propose graph of records (GoR), which leverages historical responses generated by LLMs to enhance RAG for long-context global summarization. Inspired by the retrieve-then-generate paradigm of RAG, we construct a graph by establishing an edge between the retrieved text chunks and the corresponding…

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 6 · Confidence 5

Strengths

1. The paper is well-structured, with clear explanations and logically organized sections, making it easy for readers to follow the methodology and findings.
2. The authors test GoR on four long-context summarization datasets, and the comprehensive evaluation against a variety of baselines (including both sparse and dense retrievers) demonstrates the robustness and generalizability of the method.
3. The proposed approach builds upon existing RAG techniques, making it relatively easy to impleme

Weaknesses

I didn’t find significant weaknesses in this paper. The entire architecture is built on existing modules, making the proposed framework both sound and replicable. However, this reliance on established methods might also be a limitation, as the framework feels less innovative or exciting despite being well-presented with informative experimental results.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. Introduces an innovative approach to integrate LLMs' intermediate summaries with original document chunks in a structured graph representation
2. Implements an efficient sparse connectivity strategy using top-K similar textual chunks, reducing computational complexity while maintaining information flow
3. This hierarchical structure effectively bridges local and global document understanding
4. Develops a well-motivated semantic alignment mechanism for the GNN by leveraging BERTScore-based si

Weaknesses

1. Clarity and Presentation Issues:
- Problem Definition: Lacks a dedicated subsection that formally defines long-context summarization; missing explicit comparison between GoR and existing RAG approaches for long-context summarization in the introduction; would benefit from a clear positioning diagram or framework overview
- Technical Clarity: Graph construction description (lines 147-194) lacks sufficient detail and clear visualization; Equation 3's retrieval mechanism is ambiguous

Reviewer 03 · Rating 5 · Confidence 3

Strengths

Innovative Use of Historical Responses: The paper introduces a novel approach by leveraging LLM-generated historical responses for enhancing RAG, which is largely neglected by existing methods. This approach enriches the summarization process, potentially increasing the relevance and depth of generated summaries.

Weaknesses

I. Computational Efficiency Evaluation: The paper lacks experimental validation of computational efficiency. The reliance on LLM-generated responses and retrieved chunk graphs, combined with the incorporation of graph neural networks and BERTScore-based objectives, could introduce substantial computational overhead.
II. Dependency on Data Quality: The effectiveness of GoR may rely heavily on the quality and coherence of the historical responses generated by LLMs. Inconsistencies in thes

Code & Models

Repositories

ulab-uiuc/gor
PyTorch · Official

Videos

Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs· underline

Taxonomy

Topics: Data Management and Algorithms · Topic Modeling · Data Quality and Management

Methods: Linear Layer · Multi-Head Attention · Dense Connections · WordPiece · Residual Connection · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Adam