TL;DR
This paper introduces new models for Online Learning to Rank that balance the tradeoff between learning speed and ranking quality, enabling faster training without sacrificing final ranking performance.
Contribution
It proposes Sim-MGD, a fast ranking model based on document similarities, and C-MGD, a cascading approach that combines fast learning with high-quality convergence.
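To make the idea of ranking by document similarities concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's implementation: the use of cosine similarity, the set of reference documents, the weight vector, and the function names cosine_similarities and similarity_rank are all hypothetical choices that this summary does not specify. The sketch only shows how a model could score each candidate document by a weighted sum of its similarities to a few reference documents and then sort by that score.

```python
import numpy as np


def cosine_similarities(docs, refs):
    """Cosine similarity between every row of `docs` and every row of `refs`."""
    # Guard against division by zero for all-zero feature vectors.
    docs_norm = np.maximum(np.linalg.norm(docs, axis=1, keepdims=True), 1e-12)
    refs_norm = np.maximum(np.linalg.norm(refs, axis=1, keepdims=True), 1e-12)
    return (docs / docs_norm) @ (refs / refs_norm).T


def similarity_rank(docs, refs, weights):
    """Rank documents by a weighted sum of similarities to reference documents.

    Assumption: one learned weight per reference document; this is a
    hypothetical parameterization chosen for illustration.
    """
    scores = cosine_similarities(docs, refs) @ weights
    # Highest score first; stable sort keeps the input order on ties.
    return np.argsort(-scores, kind="stable")


# Toy data: three candidate documents with two features each.
docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# A single hypothetical reference document and its weight.
refs = np.array([[1.0, 0.0]])
weights = np.array([1.0])

ranking = similarity_rank(docs, refs, weights)
print(ranking.tolist())
```

In this toy setup the model has one parameter per reference document rather than one per feature, which is one plausible reason a similarity-based model could be tuned with fewer interactions. A cascade in the spirit of C-MGD would start from such a model and later hand over to a more expressive one; the handover criterion is not given in this summary and is not sketched here.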
Findings
Sim-MGD converges rapidly, improving user experience.
C-MGD achieves fast learning with convergence comparable to state-of-the-art models.
Together, the models suggest that OLTR systems can be designed without being bound by the speed-quality tradeoff.
Abstract
In Online Learning to Rank (OLTR) the aim is to find an optimal ranking model by interacting with users. When learning from user behavior, systems must interact with users while simultaneously learning from those interactions. Unlike other Learning to Rank (LTR) settings, existing research in this field has been limited to linear models. This is due to the speed-quality tradeoff that arises when selecting models: complex models are more expressive and can find the best rankings but need more user interactions to do so, a requirement that risks frustrating users during training. Conversely, simpler models can be optimized on fewer interactions and thus provide a better user experience, but they will converge towards suboptimal rankings. This tradeoff creates a deadlock, since novel models will not be able to improve either the user experience or the final convergence point, without…
