SCOUT: Active Information Foraging for Long-Text Understanding with Decoupled Epistemic States
Zhenliang Zhang, Wenqing Wang, Yong Hu, Yaming Yang, Jiaheng Gao, Chen Shen, Xiaojun Wan

TL;DR
SCOUT introduces an active information foraging approach for long-text understanding, enabling efficient reasoning by focusing on query-relevant information and reducing token use.
Contribution
It presents a novel paradigm that shifts from passive to active exploration, improving efficiency and stability in long-text understanding tasks.
Findings
SCOUT matches state-of-the-art models with up to 8x fewer tokens.
It maintains performance as context length increases.
SCOUT reduces computational costs in long-text reasoning.
Abstract
Long-Text Understanding (LTU) at million-token scale requires balancing reasoning fidelity with computational efficiency. Frontier long-context LLMs can process million-token contexts end-to-end, but they suffer from high token consumption and attention dilution. In parallel, specialized LTU agents often sacrifice fidelity through task-agnostic abstractions like graph construction or indexing. We identify a key insight for LTU: query-relevant information is typically sparse relative to the full document, so effective reasoning should rely on a query-sufficient subset rather than the entire context. To address this, we propose SCOUT, a new paradigm for LTU that shifts from passive processing to active information foraging. It treats the document as an explorable environment and answers from a compact, provenance-grounded epistemic state. Guided by state-level gap diagnosis, SCOUT…
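Illustrative sketch (not from the paper)
The abstract's core idea, answering from a small query-sufficient subset with provenance instead of reading the whole document, can be illustrated with a toy loop. This is only a sketch under strong assumptions: every name in it (`Evidence`, `EpistemicState`, `diagnose_gaps`, `forage`) is hypothetical, and plain word overlap stands in for whatever gap diagnosis and exploration policy SCOUT actually uses, which the truncated abstract does not specify.

```python
from dataclasses import dataclass, field


def tokenize(text: str) -> set[str]:
    # Crude lowercase word set; an assumption made for this toy example only.
    return {w.strip(".,?!").lower() for w in text.split()} - {""}


@dataclass
class Evidence:
    text: str
    chunk_index: int  # provenance: which chunk the text was read from


@dataclass
class EpistemicState:
    # Compact store of what has been gathered so far (hypothetical structure).
    evidence: list[Evidence] = field(default_factory=list)

    def covered_terms(self) -> set[str]:
        return {w for e in self.evidence for w in tokenize(e.text)}


def diagnose_gaps(query: str, state: EpistemicState) -> set[str]:
    # Stand-in for gap diagnosis: query terms not yet supported by evidence.
    return tokenize(query) - state.covered_terms()


def forage(query: str, chunks: list[str], max_steps: int = 3) -> EpistemicState:
    state = EpistemicState()
    visited: set[int] = set()
    for _ in range(max_steps):
        gaps = diagnose_gaps(query, state)
        if not gaps:
            break  # the state already covers the query; stop reading
        # Visit the unread chunk that covers the most missing terms.
        best, best_score = None, 0
        for i, chunk in enumerate(chunks):
            if i in visited:
                continue
            score = len(gaps & tokenize(chunk))
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break  # no unread chunk helps close any remaining gap
        visited.add(best)
        state.evidence.append(Evidence(chunks[best], best))
    return state
```

In this toy version the loop stops as soon as the gathered evidence covers the query, so only a fraction of the chunks is ever read; that mirrors the sparsity argument in the abstract without claiming to reproduce SCOUT's mechanism or its reported token savings.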
