Subword Language Model for Query Auto-Completion

Gyuwan Kim

arXiv:1909.00599·cs.CL·September 4, 2019

TL;DR

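This paper introduces a subword language model for query auto-completion that generates completion candidates up to 2.5 times faster than a character-level baseline while maintaining similar quality. It relies on a retrace algorithm and a reranking method based on approximate marginalization, and it proposes a new evaluation metric.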

Contribution

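The paper presents a subword-based approach with a retrace algorithm, a reranking method based on approximate marginalization, and a new metric, mean recoverable length (MRL). Together these make generation faster than with character-level models and make evaluation results easier to interpret.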

Findings

01 Achieves up to 2.5x faster query completion

02 Maintains similar quality to character-level models

03 Introduces mean recoverable length (MRL) metric
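The speedup in the first finding follows from the number of decoding steps: a language model emits one token per step, so fewer tokens per query means fewer steps. The toy sketch below illustrates that effect only. The vocabulary `TOY_VOCAB` and the greedy longest-match `segment` function are hypothetical stand-ins chosen for illustration; they are not the segmentation method used in the paper or in the clovaai/subword-qac repository.

```python
def segment(query, vocab):
    """Greedy longest-match segmentation (illustrative stand-in only).

    Any character not covered by a vocabulary entry becomes its own piece.
    """
    pieces = []
    i = 0
    while i < len(query):
        # Try the longest possible piece first, down to a single character.
        for j in range(len(query), i, -1):
            if query[i:j] in vocab or j == i + 1:
                pieces.append(query[i:j])
                i = j
                break
    return pieces


# Hypothetical subword vocabulary, made up for this example.
TOY_VOCAB = {"weather", " in", " new", " york", "for", "ecast"}

query = "weather forecast in new york"
pieces = segment(query, TOY_VOCAB)

char_steps = len(query)      # a character-level model decodes one char per step
subword_steps = len(pieces)  # a subword model decodes one piece per step

print(pieces)
print(f"character steps: {char_steps}, subword steps: {subword_steps}")
```

The ratio of step counts in a toy like this is not a prediction of wall-clock speedup; the reported figure of up to 2.5x comes from the paper's own measurements.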

Abstract

Current neural query auto-completion (QAC) systems rely on character-level language models, but they slow down when queries are long. We present how to utilize subword language models for the fast and accurate generation of query completion candidates. Representing queries with subwords shortens the decoding length significantly. To deal with issues that arise from introducing a subword language model, we develop a retrace algorithm and a reranking method by approximate marginalization. As a result, our model is up to 2.5 times faster while maintaining a similar quality of generated results compared to the character-level baseline. We also propose a new evaluation metric, mean recoverable length (MRL), which measures how many upcoming characters the model could complete correctly. It provides more explicit meaning and eliminates the need for prefix length sampling for existing rank-based…
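The abstract is truncated here and does not give a formal definition of MRL, so the following is only a minimal sketch of one plausible reading: for each query, count how many trailing characters can be removed, one at a time, while the full query still appears among the candidates returned for the remaining prefix, then average over queries. The names `recoverable_length`, `mean_recoverable_length`, `toy_complete`, and `TOY_LOG` are hypothetical, and the paper's exact definition may differ.

```python
from typing import Callable, Iterable, List


def recoverable_length(query: str, complete: Callable[[str], List[str]]) -> int:
    """Largest n such that, for every 1 <= m <= n, removing the last m
    characters still lets `complete` return the full query.

    Assumption: prefixes are kept non-empty, so n is at most len(query) - 1.
    """
    length = 0
    for removed in range(1, len(query)):
        prefix = query[: len(query) - removed]
        if query not in complete(prefix):
            break
        length = removed
    return length


def mean_recoverable_length(
    queries: Iterable[str], complete: Callable[[str], List[str]]
) -> float:
    """Average recoverable length over a set of test queries."""
    queries = list(queries)
    if not queries:
        return 0.0
    return sum(recoverable_length(q, complete) for q in queries) / len(queries)


# Hypothetical completer: returns the first logged query that starts with
# the prefix. A real system would return ranked language-model candidates.
TOY_LOG = ["new york weather", "new york times", "netflix"]


def toy_complete(prefix: str, k: int = 1) -> List[str]:
    return [q for q in TOY_LOG if q.startswith(prefix)][:k]


mrl = mean_recoverable_length(TOY_LOG, toy_complete)
print(f"MRL on the toy log: {mrl:.2f}")
```

Under this reading, every prefix of every test query is evaluated, which is consistent with the abstract's remark that the metric removes the need to sample prefix lengths.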

Code & Models

Repositories

clovaai/subword-qac (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications