Continual Learning of Long Topic Sequences in Neural Information Retrieval
Thomas Gerald, Laure Soulier

TL;DR
This paper investigates the ability of neural information retrieval models to continually learn from long streams of evolving topics, highlighting challenges like catastrophic forgetting and providing insights for future model improvements.
Contribution
It introduces a new dataset for modeling long topic streams and analyzes the transfer capacity and limitations of recent neural IR models in continual learning scenarios.
Findings
Whether catastrophic forgetting occurs depends on specific conditions, such as the level of similarity between tasks.
Model performance varies with text length and learning methods.
Insights suggest directions for designing more robust continual learning IR models.
Abstract
In information retrieval (IR) systems, trends and users' interests may change over time, altering the distribution of either the requests or the contents to be recommended. Since neural ranking approaches depend heavily on the training data, it is crucial to understand the transfer capacity of recent IR approaches to address new domains in the long term. In this paper, we first propose a dataset based upon the MSMarco corpus, aiming at modeling a long stream of topics as well as IR property-driven controlled settings. We then analyze in depth the ability of recent neural IR models to continually learn those streams. Our empirical study highlights the particular cases in which catastrophic forgetting occurs (e.g., level of similarity between tasks, peculiarities of text length, and ways of learning models) to provide future directions in terms of model design.
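The abstract does not state how forgetting is quantified, so the following is only a minimal sketch of one common continual-learning definition, not the authors' method: for each earlier topic, forgetting is the best score the model reached on it before the end of the stream minus its score after the final topic. The function name `forgetting_per_task`, the score matrix layout, and the numbers are all hypothetical illustrations.

```python
from typing import List


def forgetting_per_task(scores: List[List[float]]) -> List[float]:
    """Compute per-topic forgetting over a stream of topics.

    scores[i][j] is a retrieval metric (e.g. MRR) on topic j, measured
    after the model has finished training on topic i. Only entries with
    j <= i are used. This layout is an assumption for illustration.
    """
    n = len(scores)
    final = scores[n - 1]  # scores after training on the last topic
    result = []
    # The last topic cannot have been forgotten yet, so it is skipped.
    for j in range(n - 1):
        # Best score on topic j at any step from j up to the second-to-last.
        best_earlier = max(scores[i][j] for i in range(j, n - 1))
        result.append(best_earlier - final[j])
    return result


# Hypothetical scores for a stream of three topics (not from the paper).
example_scores = [
    [0.40, 0.00, 0.00],
    [0.30, 0.50, 0.00],
    [0.25, 0.45, 0.60],
]

if __name__ == "__main__":
    for topic, value in enumerate(forgetting_per_task(example_scores)):
        print(f"topic {topic}: forgetting = {value:.2f}")
```

A positive value means the model lost performance on that topic as the stream continued; a value near zero or below means the topic was retained or benefited from later training.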
Taxonomy
Topics: Information Retrieval and Search Behavior · Advanced Graph Neural Networks · Topic Modeling
