Exploring Classic and Neural Lexical Translation Models for Information Retrieval: Interpretability, Effectiveness, and Efficiency Benefits

Leonid Boytsov; Zico Kolter

arXiv:2102.06815·cs.CL·March 19, 2021

TL;DR

This paper investigates classic and neural lexical translation models (IBM Model 1) for information retrieval. It shows that a neural Model 1 layer on top of BERT-based contextualized embeddings adds interpretability without decreasing accuracy or efficiency, and it reports top results on the MS MARCO leaderboard.

Contribution

The paper introduces a neural Model 1 aggregator layer for ranking that preserves accuracy and efficiency, adds interpretability, and may overcome the maximum sequence length limit of existing BERT models.

Findings

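01

Adding a neural Model 1 layer on top of BERT-based contextualized embeddings does not reduce accuracy or efficiency.

02

The context-free neural Model 1 runs efficiently on a CPU but is less effective than a BERT-based ranking model.

03

Model 1 was used to achieve top results on the MS MARCO leaderboard.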

Abstract

We study the utility of the lexical translation model (IBM Model 1) for English text retrieval, in particular, its neural variants that are trained end-to-end. We use the neural Model 1 as an aggregator layer applied to context-free or contextualized query/document embeddings. This new approach to designing a neural ranking system has benefits for effectiveness, efficiency, and interpretability. Specifically, we show that adding an interpretable neural Model 1 layer on top of BERT-based contextualized embeddings (1) does not decrease accuracy and/or efficiency; and (2) may overcome the limitation on the maximum sequence length of existing BERT models. The context-free neural Model 1 is less effective than a BERT-based ranking model, but it can run efficiently on a CPU (without expensive index-time precomputation or query-time operations on large tensors). Using Model 1 we produced best…
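
To make the "aggregator layer" idea in the abstract concrete, here is a minimal sketch of Model 1 style scoring over token embeddings. It uses the standard IBM Model 1 form, log P(Q|D) = sum over query tokens q of log( sum over document tokens d of T(q|d) P(d|D) ), with P(d|D) taken as uniform over document positions. The translation probability T(q|d) is replaced by a sigmoid of the embedding dot product, which is an assumption made purely for illustration and not the network used in the paper. The names `sigmoid`, `translation_prob`, and `model1_log_score` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def translation_prob(q_vec, d_vec):
    """Stand-in for T(q|d): sigmoid of the dot product (illustrative assumption)."""
    return sigmoid(sum(a * b for a, b in zip(q_vec, d_vec)))


def model1_log_score(query_vecs, doc_vecs, eps=1e-9):
    """Model 1 style aggregation of token-pair scores into one log-score.

    For each query token, average T(q|d) over all document tokens
    (uniform P(d|D)), then sum the logs over query tokens.
    """
    if not doc_vecs:
        raise ValueError("document must contain at least one token embedding")
    total = 0.0
    for q_vec in query_vecs:
        avg = sum(translation_prob(q_vec, d_vec) for d_vec in doc_vecs) / len(doc_vecs)
        total += math.log(max(avg, eps))  # eps guards against log(0)
    return total


# Usage: a document containing a token aligned with the query scores higher.
query = [[2.0, 0.0]]
doc_match = [[2.0, 0.0], [0.0, 1.0]]
doc_other = [[-2.0, 0.0], [0.0, 1.0]]
print(model1_log_score(query, doc_match) > model1_log_score(query, doc_other))
```

Because the final score is built from individual T(q|d) values, each query-token/document-token pair can be inspected directly, which is the sense in which such a layer is interpretable. The embeddings here could be context-free or contextualized; the aggregation step is the same either way.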

Taxonomy

Methods: Linear Layer · Residual Connection · Weight Decay · Multi-Head Attention · Attention Dropout · Layer Normalization · WordPiece · Dense Connections · Adam