InfoFlow KV: Information-Flow-Aware KV Recomputation for Long Context
Xin Teng, Canyu Zhang, Shaoyi Zheng, Danyang Zhuo, Tianyi Zhou, Shengjie Wang

TL;DR
This paper introduces InfoFlow KV, a method for selective key-value (KV) recomputation in long-context retrieval-augmented generation. It treats the choice of which tokens to recompute as an information flow problem, selecting tokens by whether they can actually influence generation.
Contribution
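It casts selective KV recomputation as an information flow problem and proposes an attention-norm signal from the query, computed under an inference-consistent RoPE geometry, to identify the tokens most able to propagate information during long-context generation.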
Findings
Achieves consistent performance improvements over prior methods.
Effectively identifies semantically relevant tokens for recomputation.
Demonstrates benefits on both LLM and VLM benchmarks.
Abstract
Retrieval-augmented generation (RAG) for long-context question answering is bottlenecked by inference-time prefilling over large retrieved contexts. A common strategy is to precompute key-value (KV) caches for individual documents and selectively recompute a small subset of tokens to restore global causal dependencies, but existing methods rely on heuristics or representation discrepancies without modeling whether selected tokens can effectively influence generation. We cast selective KV recomputation as an information flow problem and show that a simple attention-norm signal from the query reliably identifies tokens that are both semantically relevant and structurally positioned to propagate information, when computed under an inference-consistent RoPE geometry. We therefore reconstruct global positional assignments for retrieved chunks and introduce an information-flow-guided chunk…
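As a rough illustration of the selection step described in the abstract, the sketch below scores each cached context token by the attention it receives from the query tokens, weighted by the norm of its value vector, and keeps the top-scoring fraction for recomputation. The exact definition of the paper's attention-norm signal is not given in this summary, so the scoring formula, the function names (`attention_norm_scores`, `select_tokens`), and the `budget` parameter are all assumptions made for illustration. RoPE position reconstruction and chunk-level handling are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_norm_scores(q, k, v):
    """Hypothetical attention-norm score per context token.

    q: (n_q, d) query-token queries, k: (n_ctx, d) cached keys,
    v: (n_ctx, d_v) cached values. Returns an array of shape (n_ctx,).
    Assumption: score = sum over query tokens of attention weight * ||v||.
    """
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (n_q, n_ctx)
    v_norm = np.linalg.norm(v, axis=-1)            # (n_ctx,)
    return (attn * v_norm).sum(axis=0)

def select_tokens(scores, budget):
    """Return sorted indices of the top `budget` fraction of tokens."""
    n_keep = max(1, int(round(budget * scores.shape[0])))
    top = np.argpartition(-scores, n_keep - 1)[:n_keep]
    return np.sort(top)

# Toy usage: 10 context tokens, only token 3 aligns with the query.
k = np.zeros((10, 4)); k[3, 0] = 10.0
q = np.array([[10.0, 0.0, 0.0, 0.0]])
v = np.ones((10, 4))
scores = attention_norm_scores(q, k, v)
recompute_idx = select_tokens(scores, budget=0.1)
```

In a real pipeline the queries and keys would come from a specific layer and head of the model, after positions have been assigned consistently with inference; how those are aggregated is a design choice this sketch does not settle.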
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Data Quality and Management
