Efficient Neural Query Auto Completion
Sida Wang, Weiwei Guo, Huiji Gao, Bo Long

TL;DR
This paper introduces an efficient neural query auto completion system that balances high accuracy with low latency through effective context modeling and an unnormalized language model, improving the user experience in search applications.
Contribution
The paper presents a neural QAC system that reduces latency while improving candidate relevance, using context modeling to handle unseen prefixes and an unnormalized language model to add semantic understanding of candidates.
Findings
Increases recall for unseen prefixes.
Reduces latency by approximately 95%.
Achieves better ranking performance than existing neural methods.
Abstract
Query Auto Completion (QAC), as the starting point of information retrieval tasks, is critical to user experience. Generally it has two steps: generating completed query candidates according to query prefixes, and ranking them based on extracted features. Three major challenges are observed for a query auto completion system: (1) QAC has a strict online latency requirement. For each keystroke, results must be returned within tens of milliseconds, which poses a significant challenge in designing sophisticated language models for it. (2) For unseen queries, generated candidates are of poor quality as contextual information is not fully utilized. (3) Traditional QAC systems heavily rely on handcrafted features such as the query candidate frequency in search logs, lacking sufficient semantic understanding of the candidate. In this paper, we propose an efficient neural QAC system with…
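The two-step structure described in the abstract (generate candidates from the prefix, then rank them) can be illustrated with a minimal sketch of the traditional frequency-based baseline that the abstract criticizes. This is not the paper's neural system. The function names, the toy query log, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def build_index(query_log):
    """Count how often each full query appears in a (hypothetical) search log."""
    return Counter(q.strip().lower() for q in query_log if q.strip())


def generate_candidates(prefix, freq):
    """Step 1: collect every logged query that starts with the typed prefix."""
    prefix = prefix.lower()
    return [q for q in freq if q.startswith(prefix)]


def rank_candidates(candidates, freq, k=3):
    """Step 2: rank by log frequency (a handcrafted feature); ties break alphabetically."""
    return sorted(candidates, key=lambda q: (-freq[q], q))[:k]


def complete(prefix, freq, k=3):
    """Run both steps for a single keystroke."""
    return rank_candidates(generate_candidates(prefix, freq), freq, k)


# Toy log, invented for this example.
log = [
    "software engineer",
    "software engineer",
    "software developer",
    "sales manager",
    "software engineer intern",
]
freq = build_index(log)

print(complete("soft", freq))  # seen prefix: candidates ordered by frequency
print(complete("data", freq))  # unseen prefix: nothing to return
```

The second call returns an empty list, which shows challenge (2) in its simplest form: a pure log lookup has no candidates to offer for a prefix it has never seen. Ranking by frequency alone also reflects challenge (3), since the score carries no information about what the query means.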
