TL;DR
This paper introduces the Double-Rank Model (DRM), a deep reinforcement learning approach for complex ranking settings in which neither the relevant items nor their best display order is known in advance. The authors report that it outperforms existing methods.
Contribution
The paper proposes a novel DRM method that learns both document ranking and display layout simultaneously using weak reward signals, addressing limitations of traditional ranking models.
Findings
DRM outperforms existing ranking methods in complex settings.
The method effectively learns display layouts and rankings from weak rewards.
DRM yields significant improvements in ranking quality when the display order is unknown.
Abstract
Learning to Rank has traditionally considered settings where given the relevance information of objects, the desired order in which to rank the objects is clear. However, with today's large variety of users and layouts this is not always the case. In this paper, we consider so-called complex ranking settings where it is not clear what should be displayed, that is, what the relevant items are, and how they should be displayed, that is, where the most relevant items should be placed. These ranking settings are complex as they involve both traditional ranking and inferring the best display order. Existing learning to rank methods cannot handle such complex ranking settings as they assume that the display order is known beforehand. To address this gap we introduce a novel Deep Reinforcement Learning method that is capable of learning complex rankings, both the layout and the best ranking…
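To make the complex ranking setting in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's DRM method and involves no reinforcement learning: it only illustrates why the best display order depends on the layout. The relevance scores, the per-slot attention weights, and the function names `placement_reward` and `best_placement` are all assumptions made up for illustration.

```python
from itertools import permutations


def placement_reward(placement, relevance, attention):
    """Relevance of each shown document, weighted by the attention of its slot.

    placement[slot] is the id of the document displayed in that slot.
    """
    return sum(relevance[doc] * attention[slot] for slot, doc in enumerate(placement))


def best_placement(relevance, attention):
    """Brute-force search over every way to fill the slots with distinct documents."""
    docs = range(len(relevance))
    return max(
        permutations(docs, len(attention)),
        key=lambda p: placement_reward(p, relevance, attention),
    )


# Hypothetical data: four candidate documents, three display slots.
relevance = [0.9, 0.1, 0.5, 0.3]
linear_layout = [1.0, 0.6, 0.3]   # assumed: attention decays from the first slot to the last
center_layout = [0.3, 1.0, 0.6]   # assumed: the middle slot draws the most attention

print(best_placement(relevance, linear_layout))   # (0, 2, 3)
print(best_placement(relevance, center_layout))   # (3, 0, 2)
```

In the first layout the most relevant document belongs in the first slot, while in the second it belongs in the middle. A method that assumes a fixed display order would place it in the first slot in both cases, which is the gap the abstract describes. In the setting the paper targets, neither the relevance scores nor the attention weights are given, so both have to be inferred from reward signals rather than enumerated as in this sketch.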
