Learning to Query History: Nonstationary Classification via Learned Retrieval

Jimmy Gammell; Bishal Thapaliya; Yoon Jung; Riyasat Ohib; Bilel Fehri; Deepayan Chakrabarti

arXiv:2604.07027 · cs.LG · April 9, 2026

TL;DR

This paper introduces a method that conditions classifiers on historical data sequences using learned retrieval, improving robustness to nonstationarity and distribution shifts in practical classification tasks.

Contribution

The paper proposes a scalable learned retrieval mechanism for nonstationary classification that draws on sequences of historical labeled examples during both training and deployment.

Findings

01

Improved robustness to distribution shift on synthetic and real-world benchmarks.

02

VRAM usage scales predictably with sequence length.

03

Effective in nonstationary classification scenarios.

Abstract

Nonstationarity is ubiquitous in practical classification settings, leading deployed models to perform poorly even when they generalize well to holdout sets available at training time. We address this by reframing nonstationary classification as time series prediction: rather than predicting from the current input alone, we condition the classifier on a sequence of historical labeled examples that extends beyond the training cutoff. To scale to large sequences, we introduce a learned discrete retrieval mechanism that samples relevant historical examples via input-dependent queries, trained end-to-end with the classifier using a score-based gradient estimator. This enables the full corpus of historical data to remain on an arbitrary filesystem during training and deployment. Experiments on synthetic benchmarks and Amazon Reviews '23 (electronics category) show improved robustness to…
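
The abstract mentions training a discrete retrieval step with a score-based gradient estimator. The sketch below illustrates the generic idea only and is not the authors' implementation: it assumes a categorical retrieval distribution given by a softmax over dot products between a query and stored keys, and it treats the classifier loss obtained with each retrieved example as a fixed, made-up number. All names (`retrieval_probs`, `score_function_grad`, `exact_grad`) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def retrieval_probs(query, keys):
    """Assumed scoring rule: dot product of the query with each history key."""
    logits = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(logits)


def score_function_grad(probs, losses, rng, num_samples):
    """Monte Carlo estimate of d E[loss] / d logits.

    For a categorical distribution, grad_logits log p(i) = onehot(i) - probs,
    so each sample i contributes losses[i] * (onehot(i) - probs).
    No baseline is used, which keeps the estimate unbiased but noisier.
    """
    n = len(probs)
    grad = [0.0] * n
    indices = rng.choices(range(n), weights=probs, k=num_samples)
    for i in indices:
        for j in range(n):
            indicator = 1.0 if j == i else 0.0
            grad[j] += losses[i] * (indicator - probs[j])
    return [g / num_samples for g in grad]


def exact_grad(probs, losses):
    """Closed form for comparison: p_j * (loss_j - E[loss])."""
    expected = sum(p * l for p, l in zip(probs, losses))
    return [p * (l - expected) for p, l in zip(probs, losses)]


if __name__ == "__main__":
    rng = random.Random(0)
    keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]  # toy history keys
    query = [0.5, -0.25]                # stand-in for an input-dependent query
    losses = [0.2, 0.9, 0.4, 0.7]       # hypothetical per-example losses
    probs = retrieval_probs(query, keys)
    print(score_function_grad(probs, losses, rng, 50000))
    print(exact_grad(probs, losses))
```

Because only sampled indices enter the estimate, the sampled example is the only one that has to be loaded for the classifier's forward pass, which is in keeping with the abstract's point that the historical corpus can stay on a filesystem. This toy version still scores every key in memory; how the paper structures its queries to avoid that is not stated in the abstract.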
