Complementing Lexical Retrieval with Semantic Residual Embedding

Luyu Gao; Zhuyun Dai; Tongfei Chen; Zhen Fan; Benjamin Van Durme; Jamie Callan

arXiv:2004.13969 · cs.IR · March 30, 2021 · 59 cites

TL;DR

This paper introduces CLEAR, a retrieval model combining lexical and semantic matching using residual embeddings, improving accuracy and efficiency in reranking pipelines.

Contribution

CLEAR's novel residual-based embedding learning explicitly captures semantics beyond lexical matching, enhancing retrieval performance.

Findings

01 CLEAR outperforms state-of-the-art retrieval models in the paper's empirical evaluations

02 Substantial improvement in the end-to-end accuracy of reranking pipelines

03 Improved efficiency of reranking pipelines

Abstract

This paper presents CLEAR, a retrieval model that seeks to complement classical lexical exact-match models such as BM25 with semantic matching signals from a neural embedding matching model. CLEAR explicitly trains the neural embedding to encode language structures and semantics that lexical retrieval fails to capture with a novel residual-based embedding learning method. Empirical evaluations demonstrate the advantages of CLEAR over state-of-the-art retrieval models, and that it can substantially improve the end-to-end accuracy and efficiency of reranking pipelines.
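
The abstract does not spell out the formulas, so the following is only a minimal sketch of how a residual-style objective could look, not the authors' implementation. It assumes (all names and constants here are hypothetical) that the final score is a weighted lexical score plus an embedding dot product, and that the embedding model is trained with a pairwise hinge loss whose margin grows when the lexical model ranks a relevant document below a non-relevant one and shrinks when the lexical model is already correct.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def combined_score(lex_score, q_emb, d_emb, lam=1.0):
    """Assumed fusion rule: weighted lexical score (e.g. BM25) plus embedding similarity."""
    return lam * lex_score + dot(q_emb, d_emb)


def residual_margin(lex_pos, lex_neg, base_margin=1.0, lam=0.1):
    """Assumed margin: smaller when the lexical model already prefers the
    relevant document, larger when the lexical model gets the pair wrong."""
    return base_margin - lam * (lex_pos - lex_neg)


def residual_hinge_loss(q_emb, pos_emb, neg_emb, lex_pos, lex_neg,
                        base_margin=1.0, lam=0.1):
    """Pairwise hinge loss on embedding scores with the lexical-aware margin."""
    margin = residual_margin(lex_pos, lex_neg, base_margin, lam)
    return max(0.0, margin - dot(q_emb, pos_emb) + dot(q_emb, neg_emb))


# Toy vectors and scores, invented purely for illustration.
q, pos, neg = [1.0, 0.0], [0.5, 0.0], [0.2, 0.0]

# Lexical model already ranks the relevant document higher: little or no loss.
loss_lexical_right = residual_hinge_loss(q, pos, neg, lex_pos=12.0, lex_neg=2.0)

# Lexical model ranks the pair wrongly: the embedding is pushed to compensate.
loss_lexical_wrong = residual_hinge_loss(q, pos, neg, lex_pos=2.0, lex_neg=12.0)
```

Under these assumptions the embedding model receives gradient mainly on pairs the lexical model gets wrong, which is one way to read "semantics that lexical retrieval fails to capture".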

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications