Maximally-Informative Retrieval for State Space Model Generation

Evan Becker; Benjamin Bowman; Matthew Trager; Tian Yu Liu; Luca Zancato; Wei Xia; Stefano Soatto

arXiv:2506.12149 · cs.CL · June 17, 2025

TL;DR

This paper introduces RICO, a gradient-based retrieval method for state space models that optimizes document selection at inference time, improving answer accuracy without additional fine-tuning.

Contribution

We propose Retrieval In-Context Optimization (RICO), a novel gradient-based retrieval approach that directly leverages model feedback to select relevant data for inference.

Findings

01

RICO achieves comparable performance to BM25 without fine-tuning.

02

Our method often outperforms fine-tuned dense retrievers like E5.

03

Theoretically, top-k retrieval with gradients approximates our optimization.
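
The third finding can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a hypothetical quadratic loss over continuous document-selection weights, with made-up feature vectors standing in for documents and a made-up target standing in for the query. Under those assumptions, ranking documents by the gradient of the loss at zero weights and keeping the top k is a first-order approximation to optimizing the selection directly.

```python
import numpy as np

def selection_gradient(doc_feats, target, weights):
    """Gradient of the toy loss 0.5 * ||weights @ doc_feats - target||^2
    with respect to the per-document selection weights."""
    residual = weights @ doc_feats - target
    return doc_feats @ residual

def top_k_by_gradient(doc_feats, target, k):
    """Pick the k documents whose weights have the most negative gradient
    at the all-zero selection, i.e. the largest first-order loss decrease."""
    weights = np.zeros(doc_feats.shape[0])
    grad = selection_gradient(doc_feats, target, weights)
    return np.argsort(grad)[:k]

# Hypothetical data: four "documents" and one "query target" in 2-D.
doc_feats = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.7, 0.7],
])
target = np.array([1.0, 0.2])

selected = top_k_by_gradient(doc_feats, target, k=2)
print(selected)
```

In this toy setting the gradient at zero weights is just the negated inner product between each document and the target, so the gradient ranking coincides with a similarity ranking. With a real language model the loss and its gradients would come from the model itself, which this sketch does not attempt to reproduce.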

Abstract

Given a query and a dataset, the optimal way of answering the query is to make use of all the information available. Modern LLMs exhibit an impressive ability to memorize training data, but data not deemed important during training is forgotten, and information outside the training set cannot be used. Processing an entire dataset at inference time is infeasible because model resources are bounded (e.g., context size in transformers or states in state space models), so we must resort to external memory. This constraint naturally leads to the following problem: how can we decide, based on the present query and model, what among a virtually unbounded set of known data matters for inference? To minimize model uncertainty for a particular query at test time, we introduce Retrieval In-Context Optimization (RICO), a retrieval method that uses gradients from the LLM itself to learn…

Taxonomy

Topics: Time Series Analysis and Forecasting