TL;DR
This paper introduces a session-aware query auto-completion method modeled as an extreme multi-label ranking problem, achieving high accuracy and low latency for real-time search suggestions by leveraging session context.
Contribution
The paper adapts extreme multi-label ranking algorithms to session-aware query auto-completion, improving ranking performance while keeping inference latency low enough for per-keystroke suggestions.
Findings
3.9x improvement in Mean Reciprocal Rank (MRR) over baseline
Inference latency maintained under 10 ms
33% improvement in suggestion acceptance rate in online deployment
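For readers unfamiliar with the headline metric, the following is a minimal sketch of how Mean Reciprocal Rank is commonly computed for auto-completion. The function name and input layout are illustrative assumptions, not taken from the paper: each example contributes 1/rank of the query the user actually submitted, or 0 if that query is missing from the suggestion list.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of the target query in each suggestion list.

    ranked_lists: list of suggestion lists, best suggestion first (assumed layout).
    targets: the query the user actually submitted for each example.
    Examples whose target is absent from the list contribute 0.
    """
    if not targets:
        return 0.0
    total = 0.0
    for suggestions, target in zip(ranked_lists, targets):
        if target in suggestions:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (suggestions.index(target) + 1)
    return total / len(targets)


# Hypothetical example data: ranks are 1, 3, and "not found"
ranked = [["a", "b"], ["x", "y", "z"], ["p"]]
submitted = ["a", "z", "q"]
mrr = mean_reciprocal_rank(ranked, submitted)
```

Under this definition, a stated multiplicative gain in MRR means the submitted query appears, on average, much closer to the top of the suggestion list.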
Abstract
Query auto-completion (QAC) is a fundamental feature in search engines where the task is to suggest plausible completions of a prefix typed in the search bar. Previous queries in the user session can provide useful context for the user's intent and can be leveraged to suggest auto-completions that are more relevant while adhering to the user's prefix. Such session-aware QACs can be generated by recent sequence-to-sequence deep learning models; however, these generative approaches often do not meet the stringent latency requirements of responding to each user keystroke. Moreover, these generative approaches pose the risk of showing nonsensical queries. In this paper, we provide a solution to this problem: we take the novel approach of modeling session-aware QAC as an eXtreme Multi-Label Ranking (XMR) problem where the input is the previous query in the session and the user's current…
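The ranking formulation described in the abstract can be illustrated with a toy sketch. This is not the paper's model: it swaps in simple co-occurrence counts as the scorer, and the names `train`, `suggest`, and `label_set` are hypothetical. It only shows the shape of the problem, where the input is the previous query plus the typed prefix and the output is a ranking over a fixed set of known queries (the labels).

```python
from collections import Counter, defaultdict


def train(sessions):
    """Count how often each query directly follows a given previous query."""
    follow_counts = defaultdict(Counter)
    for session in sessions:
        for prev_q, next_q in zip(session, session[1:]):
            follow_counts[prev_q][next_q] += 1
    return follow_counts


def suggest(follow_counts, label_set, prev_query, prefix, k=3):
    """Rank known queries that match the prefix, using session context."""
    # Only labels that adhere to the typed prefix are eligible.
    candidates = [q for q in label_set if q.startswith(prefix)]
    # Toy scorer (assumption): how often the label followed prev_query.
    scores = follow_counts.get(prev_query, Counter())
    # Higher score first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda q: (-scores[q], q))
    return candidates[:k]


# Hypothetical session logs
sessions = [
    ["running shoes", "nike running shoes"],
    ["running shoes", "nike running shoes"],
    ["laptop", "nike air"],
]
label_set = sorted({q for s in sessions for q in s})
counts = train(sessions)
after_shoes = suggest(counts, label_set, "running shoes", "ni")
after_laptop = suggest(counts, label_set, "laptop", "ni")
```

The same prefix yields a different ordering depending on the previous query, and every suggestion comes from the fixed label set, so a ranker of this shape cannot produce a query that was never seen. A real extreme multi-label ranker would replace the counting scorer with a learned model over a much larger label space.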
