KnowTrace: Bootstrapping Iterative Retrieval-Augmented Generation with Structured Knowledge Tracing

Rui Li; Quanyu Dai; Zeyu Zhang; Xu Chen; Zhenhua Dong; Ji-Rong Wen

arXiv:2505.20245 · cs.CL · May 27, 2025

TL;DR

KnowTrace organizes retrieved information into a question-specific knowledge graph, which reduces context overload in iterative retrieval-augmented generation and supports stronger multi-step reasoning by large language models on complex questions.

Contribution

KnowTrace is a RAG framework that organizes retrieved information into knowledge triplets and uses self-bootstrapping to improve reasoning quality.

Findings

01 Outperforms existing methods on three multi-hop question answering benchmarks.

02 Structured knowledge organization reduces context overload.

03 Bootstrapping further improves reasoning accuracy.

Abstract

Recent advances in retrieval-augmented generation (RAG) furnish large language models (LLMs) with iterative retrievals of relevant information to handle complex multi-hop questions. These methods typically alternate between LLM reasoning and retrieval to accumulate external information into the LLM's context. However, the ever-growing context inherently imposes an increasing burden on the LLM to perceive connections among critical information pieces, with futile reasoning steps further exacerbating this overload issue. In this paper, we present KnowTrace, an elegant RAG framework to (1) mitigate the context overload and (2) bootstrap higher-quality multi-step reasoning. Instead of simply piling the retrieved contents, KnowTrace autonomously traces out desired knowledge triplets to organize a specific knowledge graph relevant to the input question. Such a structured workflow not only…
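The idea of tracing out knowledge triplets into a question-specific graph, rather than piling retrieved passages into the context, can be illustrated with a toy sketch. This is not the authors' implementation. The `KnowledgeGraph` class, the `trace` loop, and the `extract` callback are hypothetical names introduced here, and `extract` stands in for the retrieval plus LLM extraction step, which is replaced by a hard-coded dictionary.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Stores (subject, relation, object) triplets without duplicates."""

    def __init__(self):
        self.triplets = set()
        self._by_subject = defaultdict(list)

    def add(self, subj, rel, obj):
        """Add a triplet; return True only if it was not already present."""
        triplet = (subj, rel, obj)
        if triplet in self.triplets:
            return False
        self.triplets.add(triplet)
        self._by_subject[subj].append((rel, obj))
        return True

    def neighbors(self, subj):
        """Return the (relation, object) pairs recorded for a subject."""
        return list(self._by_subject.get(subj, []))

    def as_context(self):
        """Render the graph as compact text an LLM prompt could include."""
        return "\n".join(f"({s}; {r}; {o})" for s, r, o in sorted(self.triplets))


def trace(question_entities, extract, max_iters=5):
    """Expand the graph outward from the question's entities.

    `extract(entity)` is an assumed callback returning a list of triplets
    about `entity`; in a real system it would wrap retrieval and an LLM.
    """
    graph = KnowledgeGraph()
    frontier = list(question_entities)
    seen = set(frontier)
    for _ in range(max_iters):
        next_frontier = []
        for entity in frontier:
            for subj, rel, obj in extract(entity):
                # Follow only newly discovered objects in the next round.
                if graph.add(subj, rel, obj) and obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        if not next_frontier:
            break  # nothing new was learned, so stop early
        frontier = next_frontier
    return graph


# Toy corpus standing in for retrieval results (illustrative data only).
TOY_FACTS = {
    "Inception": [("Inception", "directed by", "Christopher Nolan")],
    "Christopher Nolan": [("Christopher Nolan", "born in", "London")],
}

graph = trace(["Inception"], lambda entity: TOY_FACTS.get(entity, []))
print(graph.as_context())
```

For a two-hop question such as "Where was the director of Inception born?", the context passed to the model is two short triplets rather than two full passages, which is the kind of compaction the abstract attributes to structured tracing.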

Code & Models

Repositories

rui9812/knowtrace (PyTorch, official)

Taxonomy

Topics: Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Attention Dropout · Softmax · WordPiece · Weight Decay · Multi-Head Attention · Layer Normalization · Byte Pair Encoding