Maps Search Misspelling Detection Leveraging Domain-Augmented Contextual Representations
Yutong Li

TL;DR
This paper examines domain-specific contextual representations for detecting misspellings in Maps Search queries. It finds that a single-domain fine-tuned BERT model outperforms the other models tested, and it also reports findings on data generation.
Contribution
It introduces a four-stage modeling approach for misspelling detection in map search, highlighting the effectiveness of single-domain fine-tuned BERT over other models.
Findings
Single-domain fine-tuned BERT outperforms cross-domain models.
Advanced BERT variants like RoBERTa do not always outperform BERT.
The data generation algorithm is reported as a notable breakthrough.
Abstract
Building an independent misspelling detector and serving it before correction can bring multiple benefits to the speller and other search components, which is particularly true for the most commonly deployed noisy-channel based speller systems. With the rapid development of deep learning and substantial advancement in contextual representation learning such as BERTology, building a decent misspelling detector without having to rely on hand-crafted features associated with the noisy-channel architecture has become more accessible than ever. However, BERTology models are trained on natural language corpora, while Maps Search is highly domain specific; would BERTology continue its success? In this paper we design 4 stages of models for misspelling detection, ranging from the most basic LSTM to single-domain augmented fine-tuned BERT. We found for Maps Search in our case, other advanced BERTology family model…
Taxonomy
Topics: Topic Modeling · Spam and Phishing Detection · Domain Adaptation and Few-Shot Learning
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · WordPiece · Attention Dropout · Residual Connection · Dropout · Adam · Dense Connections
