CREAM: Continual Retrieval on Dynamic Streaming Corpora with Adaptive Soft Memory
HuiJeong Son, Hyeongu Kang, Sunho Kim, Subeen Ho, SeongKu Kang, Dongha Lee, Susik Yoon

TL;DR
CREAM is a self-supervised continual retrieval framework that adapts to evolving streaming data without labels, significantly improving retrieval accuracy on dynamic corpora.
Contribution
CREAM introduces a novel self-supervised approach for memory-based continual retrieval that effectively handles unseen topics without ground-truth labels.
Findings
Outperforms existing methods by 27.79% in Success@5
Achieves 44.5% improvement in Recall@10
Matches or exceeds supervised method performance
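For readers unfamiliar with the two metrics named above, the sketch below shows how Success@k and Recall@k are commonly computed for a single query. It assumes the standard definitions, which may differ from the paper's exact evaluation protocol. The function names `success_at_k` and `recall_at_k` and the document IDs are hypothetical, chosen only for illustration.

```python
def success_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any relevant document is in the top-k results, else 0.0.

    Assumed standard definition; not taken from the paper.
    """
    relevant = set(relevant_ids)
    return 1.0 if any(doc_id in relevant for doc_id in ranked_ids[:k]) else 0.0


def recall_at_k(ranked_ids, relevant_ids, k):
    """Return the fraction of relevant documents found in the top-k results.

    Assumed standard definition; not taken from the paper.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # no relevant documents: define recall as 0 by convention
    hits = len(relevant & set(ranked_ids[:k]))
    return hits / len(relevant)


# Hypothetical ranking for one query, best match first.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d4"}

s5 = success_at_k(ranked, relevant, 5)   # one relevant doc is in the top 5
r5 = recall_at_k(ranked, relevant, 5)    # one of two relevant docs retrieved
```

Both values are per-query; a reported score is typically the mean over all evaluation queries.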
Abstract
Information retrieval (IR) in dynamic data streams is a crucial task, as shifts in data distribution degrade the performance of AI-powered IR systems. To mitigate this issue, memory-based continual learning has been widely adopted for IR. However, existing methods rely on a fixed set of queries with ground-truth documents, which limits generalization to unseen data, making them impractical for real-world applications. To enable more effective learning with unseen topics of a new corpus without ground-truth labels, we propose CREAM, a self-supervised framework for memory-based continual retrieval. CREAM captures the evolving semantics of streaming queries and documents into dynamically structured soft memory and leverages it to adapt to both seen and unseen topics in an unsupervised setting. We realize this through three key techniques: fine-grained similarity estimation, regularized…
Taxonomy
Topics: Information Retrieval and Search Behavior · Image Retrieval and Classification Techniques · Domain Adaptation and Few-Shot Learning
