StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Zhuoqun Li; Xuanang Chen; Haiyang Yu; Hongyu Lin; Yaojie Lu; Qiaoyu Tang; Fei Huang; Xianpei Han; Le Sun; Yongbin Li

arXiv:2410.08815·cs.CL·October 28, 2024

TL;DR

StructRAG is a novel framework that enhances LLMs' reasoning by converting scattered knowledge into structured formats, significantly improving performance on complex knowledge-intensive tasks.

Contribution

It introduces a method to identify optimal knowledge structures and reconstruct documents accordingly, advancing retrieval-augmented generation techniques.

Findings

01

Achieves state-of-the-art results across various knowledge-intensive tasks.

02

Excels in challenging knowledge-intensive scenarios.

03

Demonstrates that effective knowledge structuring improves reasoning.

Abstract

Retrieval-augmented generation (RAG) is a key means of effectively enhancing large language models (LLMs) in many knowledge-based tasks. However, existing RAG methods struggle with knowledge-intensive reasoning tasks because the useful information required for these tasks is badly scattered. This characteristic makes it difficult for existing RAG methods to accurately identify key information and perform global reasoning with such noisy augmentation. In this paper, motivated by cognitive theories that humans convert raw information into various forms of structured knowledge when tackling knowledge-intensive reasoning, we propose a new framework, StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure. Extensive experiments across various knowledge-intensive tasks…
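The three steps named in the abstract (choose a structure type, rebuild the documents in that structure, answer from the result) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the two structure types, the keyword rule, and the "name: value" document format are hypothetical and are not taken from the paper or its repository. Each step is a trivial rule-based stand-in, so this shows only the control flow, not the authors' method.

```python
# Toy illustration of a route -> structurize -> answer pipeline.
# All names and rules here are hypothetical stand-ins, not the paper's code.

def choose_structure(question: str) -> str:
    """Step 1 (stand-in): pick a structure type with a keyword rule."""
    comparative = ("compare", "highest", "lowest", "which")
    if any(word in question.lower() for word in comparative):
        return "table"
    return "chunk"


def structurize(documents: list[str], structure: str) -> list:
    """Step 2 (stand-in): rebuild raw documents in the chosen format."""
    if structure == "table":
        rows = []
        for doc in documents:
            # Assumes toy documents of the form "name: value".
            name, _, value = doc.partition(":")
            rows.append({"name": name.strip(), "value": float(value)})
        return rows
    # "chunk" keeps the documents as plain text passages.
    return list(documents)


def answer(structured: list, structure: str) -> str:
    """Step 3 (stand-in): read the answer off the structured form."""
    if structure == "table":
        # For the toy comparative question, return the row with the largest value.
        return max(structured, key=lambda row: row["value"])["name"]
    return structured[0] if structured else ""


def struct_pipeline(question: str, documents: list[str]) -> str:
    """Run the three steps in order."""
    structure = choose_structure(question)
    structured = structurize(documents, structure)
    return answer(structured, structure)


docs = ["Acme: 12.5", "Globex: 30.0", "Initech: 7.25"]
result = struct_pipeline("Which company has the highest revenue?", docs)
```

In this toy run, the comparative question is routed to the table form, so the answer comes from comparing rows rather than scanning scattered text. A realistic system would replace each stand-in with a learned or LLM-driven component.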

Code & Models

Repositories

li-z-q/structrag
Official

Videos

StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization · slideslive

Taxonomy

Topics: Semantic Web and Ontologies · Topic Modeling · Data Quality and Management

Methods: Attention Dropout · Linear Layer · Weight Decay · WordPiece · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · BERT