DeepRefine: Agent-Compiled Knowledge Refinement via Reinforcement Learning

Haoyu Huang; Jiaxin Bai; Shujie Liu; Yang Wei; Hong Ting Tsang; Yisen Gao; Zhongwei Xie; Yufei Li; Yangqiu Song

arXiv:2605.10488·cs.CL·May 12, 2026

TL;DR

DeepRefine is a reasoning model trained with reinforcement learning that iteratively refines agent-compiled knowledge bases by diagnosing and correcting defects in their content, improving downstream task performance.

Contribution

It introduces a novel LLM reasoning model for knowledge refinement that uses abductive diagnosis and reinforcement learning to enhance knowledge base quality.

Findings

01. DeepRefine achieves consistent improvements over baseline methods.

02. The system effectively localizes and corrects knowledge defects.

03. Reinforcement learning with GBD reward optimizes refinement policies.

Abstract

Agent-compiled knowledge bases provide persistent external knowledge for large language model (LLM) agents in open-ended, knowledge-intensive downstream tasks. Yet their quality is systematically limited by incompleteness, incorrectness, and redundancy, manifested as missing evidence or cross-document links, low-confidence or imprecise claims, and ambiguity or coreference-resolution issues. Such defects compound under iterative use, degrading retrieval fidelity and downstream task performance. We present DeepRefine, a general LLM-based reasoning model for agent-compiled knowledge refinement that uses user queries to improve the quality of any pre-constructed knowledge base, making it more suitable for downstream tasks. DeepRefine performs multi-turn interactions with the knowledge base and conducts abductive diagnosis over interaction history,…

Code & Models

Repositories

hkust-knowcomp/DeepRefine
github

Models

HaoyuHuang2/DeepRefine-v1-8B
Hugging Face model · 124 downloads · 1 like
