IntrAgent: An LLM Agent for Content-Grounded Information Retrieval through Literature Review

Fengbo Ma; Zixin Rao; Xiaoting Li; Zhetao Chen; Hongyue Sun; Yiping Zhao; Xianyan Chen; Zhen Xiang

arXiv:2604.22861·cs.IR·April 28, 2026

TL;DR

IntrAgent is an LLM-based agent that automates precise, content-grounded information retrieval from scientific literature. It mimics human reading behavior through a two-stage process and is evaluated on a new benchmark, IntraBench.

Contribution

The paper introduces IntrAgent, a novel LLM agent with a two-stage pipeline for fine-grained literature-based information retrieval, and presents IntraBench, a new benchmark for evaluation.

Findings

01. IntrAgent outperforms state-of-the-art RAG and research-agent baselines by 13.2% on average.
02. The two-stage pipeline effectively mimics human reading and reasoning behaviors.
03. IntraBench provides a rigorous, expert-annotated dataset across five STEM domains.

Abstract

Scientific research relies on accurate information retrieval from literature to support analytical decisions. In this work, we introduce a new task, INformation reTRieval through literAture reVIEW (IntraView), which aims to automate fine-grained information retrieval faithfully grounded in the provided content in response to research-driven queries, and propose IntrAgent, an LLM-based agent that addresses this challenging task. In particular, IntrAgent is designed to mimic human behaviors when reading literature for information retrieval -- identifying relevant sections and then iteratively extracting key details to refine the retrieved information. It follows a two-stage pipeline: a Section Ranking stage that prioritizes relevant literature sections through structural-knowledge-enabled reasoning, and an Iterative Reading stage that continuously extracts details and synthesizes them…
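The two-stage flow described in the abstract (rank sections first, then read the top-ranked ones and collect details) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `rank_sections` and `iterative_read`, the `max_sections` parameter, and the toy paper text are all hypothetical, and plain keyword overlap stands in for the LLM reasoning that IntrAgent actually uses.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a toy stand-in for LLM understanding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_sections(query, sections):
    """Stage 1 (sketch): order section titles by token overlap with the query.

    Assumption: keyword overlap replaces the structural-knowledge-enabled
    reasoning mentioned in the abstract.
    """
    q = tokenize(query)
    scores = {
        title: len(q & tokenize(title + " " + body))
        for title, body in sections.items()
    }
    # Highest score first; sorted() is stable, so ties keep document order.
    return sorted(sections, key=lambda title: -scores[title])


def iterative_read(query, sections, ranked, max_sections=2):
    """Stage 2 (sketch): read the top sections in order and keep every
    sentence that shares at least one token with the query."""
    q = tokenize(query)
    notes = []
    for title in ranked[:max_sections]:
        for sentence in re.split(r"(?<=[.!?])\s+", sections[title]):
            if q & tokenize(sentence):
                notes.append((title, sentence.strip()))
    return notes


# Made-up example paper, for illustration only.
sections = {
    "Introduction": "Alloys are widely used in industry.",
    "Methods": "Samples were annealed at 450 C for two hours. "
               "Hardness was measured afterwards.",
    "Results": "Hardness increased after annealing.",
}
query = "annealed temperature of samples"

ranked = rank_sections(query, sections)
notes = iterative_read(query, sections, ranked)
print(ranked)
print(notes)
```

Each collected note keeps the title of the section it came from, so an answer can be traced back to the provided content. A real system would replace both the scoring and the sentence filter with model calls.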

Code & Models

Datasets

IntrAgent/IntraBench
dataset · 163 downloads
