Scalable Exploration for Neural Online Learning to Rank with Perturbed Feedback
Yiling Jia, Hongning Wang

TL;DR
This paper introduces an efficient bootstrapping-based exploration strategy for neural online learning to rank (OL2R). It reduces the computational cost of exploration while maintaining ranking effectiveness, which makes neural rankers practical to deploy in interactive settings.
Contribution
It proposes a novel ensemble-based exploration method for neural OL2R that avoids explicit confidence set construction, improving efficiency and scalability.
Findings
Outperforms state-of-the-art OL2R algorithms on benchmark datasets.
Achieves theoretical guarantees with reduced computational overhead.
Demonstrates practical applicability in online neural ranking tasks.
Abstract
Deep neural networks (DNNs) demonstrate significant advantages in improving ranking performance in retrieval tasks. Driven by the recent technical developments in optimization and generalization of DNNs, learning a neural ranking model online from its interactions with users becomes possible. However, the required exploration for model learning has to be performed in the entire neural network parameter space, which is prohibitively expensive and limits the application of such online solutions in practice. In this work, we propose an efficient exploration strategy for online interactive neural ranker learning based on the idea of bootstrapping. Our solution employs an ensemble of ranking models trained with perturbed user click feedback. The proposed method eliminates explicit confidence set construction and the associated computational overhead, which enables the online neural…
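The idea of exploring with an ensemble trained on perturbed click feedback can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps the neural ranker for a linear scorer fit by closed-form ridge regression, assumes Gaussian noise as the perturbation, and assumes a simple rule in which a document pair counts as "certain" only when every ensemble member agrees on its order. All function names and parameters (train_ensemble, pair_is_certain, rank_with_exploration, noise_std, lam) are hypothetical and chosen for illustration.

```python
import numpy as np


def train_ensemble(diffs, labels, n_models=5, noise_std=0.5, lam=1.0, seed=0):
    """Fit n_models linear pairwise scorers, each on its own perturbed labels.

    diffs:  (n_pairs, dim) feature differences x_i - x_j for observed pairs.
    labels: (n_pairs,) +1 if document i was preferred over j, else -1.
    """
    rng = np.random.default_rng(seed)
    gram = diffs.T @ diffs + lam * np.eye(diffs.shape[1])
    weights = []
    for _ in range(n_models):
        # Perturbation step (assumed Gaussian): independent noise on every label.
        noisy = labels + rng.normal(0.0, noise_std, size=labels.shape)
        # Closed-form ridge regression, a stand-in for training a neural ranker.
        weights.append(np.linalg.solve(gram, diffs.T @ noisy))
    return np.stack(weights)  # shape (n_models, dim)


def pair_is_certain(weights, x_i, x_j):
    """A pair is treated as certain only if all ensemble members agree on its order."""
    scores = weights @ (x_i - x_j)
    return bool(np.all(scores > 0) or np.all(scores < 0))


def rank_with_exploration(weights, docs, rng):
    """Rank by mean ensemble score, then randomize adjacent uncertain pairs."""
    mean_scores = (docs @ weights.T).mean(axis=1)
    order = [int(i) for i in np.argsort(-mean_scores)]
    for k in range(len(order) - 1):
        a, b = order[k], order[k + 1]
        # Explore: flip a coin for neighbors the ensemble disagrees on.
        if not pair_is_certain(weights, docs[a], docs[b]) and rng.random() < 0.5:
            order[k], order[k + 1] = b, a
    return order
```

In this sketch, disagreement among ensemble members plays the role that an explicit confidence set would otherwise play: no covariance matrix over the parameter space is built or inverted per decision, and the cost of exploration is that of training and evaluating a fixed number of models.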
Taxonomy
Topics: Machine Learning and ELM · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques
