Dealing with Typos for BERT-based Passage Retrieval and Ranking
Shengyao Zhuang, Guido Zuccon

TL;DR
This paper investigates how typos in queries affect BERT-based passage retrieval and ranking, and proposes a typos-aware training framework to improve robustness and effectiveness in such scenarios.
Contribution
It introduces a simple yet effective typos-aware training method for the Dense Retriever (DR) and the BERT re-ranker, making both more robust to typos in queries for passage retrieval and ranking.
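One way to picture typos-aware training is as query-side data augmentation: during training, some queries are replaced by a copy containing a synthetic typo, while the relevance labels stay unchanged. The sketch below is only an illustration under that assumption and is not the authors' implementation. The function names (`inject_typo`, `augment_queries`), the four character-level edit operations, and the `typo_rate` default of 0.5 are all hypothetical choices made for this example.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def inject_typo(query: str, rng: random.Random) -> str:
    """Apply one random character-level edit to one word of the query.

    Hypothetical helper: the edit operations are illustrative assumptions.
    """
    words = query.split()
    # Only edit words long enough to stay recognisable after one edit.
    candidates = [i for i, w in enumerate(words) if len(w) > 3]
    if not candidates:
        return query
    i = rng.choice(candidates)
    word = words[i]
    # Choose an interior position so the first character is never touched.
    pos = rng.randrange(1, len(word) - 1)
    op = rng.choice(["insert", "delete", "substitute", "swap"])
    letter = rng.choice(ALPHABET)
    if op == "insert":
        word = word[:pos] + letter + word[pos:]
    elif op == "delete":
        word = word[:pos] + word[pos + 1:]
    elif op == "substitute":
        word = word[:pos] + letter + word[pos + 1:]
    else:  # swap the character at pos with the one that follows it
        word = word[:pos] + word[pos + 1] + word[pos] + word[pos + 2:]
    words[i] = word
    return " ".join(words)


def augment_queries(queries, typo_rate=0.5, seed=0):
    """Replace each query with a typo variant with probability typo_rate.

    typo_rate=0.5 is an assumed default for this sketch.
    """
    rng = random.Random(seed)
    return [
        inject_typo(q, rng) if rng.random() < typo_rate else q
        for q in queries
    ]


if __name__ == "__main__":
    for q in augment_queries(["what is dense passage retrieval"] * 3, typo_rate=1.0):
        print(q)
```

In a training loop, the augmented queries would be paired with the same positive and negative passages as the original queries, so the model is pushed to match a misspelled query to the same passages as its clean form.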
Findings
Typos in queries significantly reduce the retrieval and ranking effectiveness of the Dense Retriever and the BERT re-ranker.
Typos-aware training improves robustness and effectiveness.
Models trained with the typos-aware framework outperform models trained in the standard way.
Abstract
Passage retrieval and ranking is a key task in open-domain question answering and information retrieval. Current effective approaches mostly rely on pre-trained deep language model-based retrievers and rankers. These methods have been shown to effectively model the semantic matching between queries and passages, even in the presence of keyword mismatch, i.e., passages that are relevant to a query but do not contain important query keywords. In this paper we consider the Dense Retriever (DR), a passage retrieval method, and the BERT re-ranker, a popular passage re-ranking method. In this context, we formally investigate how these models respond and adapt to a specific type of keyword mismatch: that caused by keyword typos occurring in queries. Through empirical investigation, we find that typos can lead to a significant drop in retrieval and ranking effectiveness. We then propose a simple…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Adam · Weight Decay · Softmax · Residual Connection · WordPiece
