Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence

Bhaskar Mitra; Sebastian Hofstatter; Hamed Zamani; Nick Craswell

arXiv:2104.09393·cs.IR·April 20, 2021

Open Access

TL;DR

This paper extends the Transformer-Kernel (TK) ranking model with a Conformer layer and query term independence, improving retrieval quality over TKL on TREC benchmarks while keeping the model efficient.

Contribution

It proposes a novel Conformer layer to scale Transformer-Kernel (TK) models to longer input sequences, and incorporates query term independence and explicit term matching to extend them to the full retrieval setting.

Findings

01. Outperforms TKL in retrieval quality
02. Beats all non-neural baselines on NDCG@10
03. Surpasses two-thirds of pretrained Transformer models

Abstract

The Transformer-Kernel (TK) model has demonstrated strong reranking performance on the TREC Deep Learning benchmark, and can be considered an efficient (but slightly less effective) alternative to other Transformer-based architectures that employ (i) large-scale pretraining (high training cost), (ii) joint encoding of query and document (high inference cost), and (iii) a larger number of Transformer layers (both high training and high inference costs). Since then, a variant of the TK model, called TKL, has been developed that incorporates local self-attention to efficiently process longer input sequences in the context of document ranking. In this work, we propose a novel Conformer layer as an alternative approach to scale TK to longer input sequences. Furthermore, we incorporate query term independence and explicit term matching to extend the model to the full retrieval setting.…
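
As a rough illustration of the query term independence idea mentioned in the abstract, the sketch below scores a document as the sum of independent per-term scores, so those scores can be precomputed offline and looked up at query time. It is a toy sketch under stated assumptions, not the paper's model: `toy_score` is a hypothetical stand-in (normalized term frequency) for a learned term-document scorer, and `build_term_index` and `rank` are illustrative names that do not come from the paper.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def toy_score(term: str, text: str) -> float:
    """Hypothetical per-term scorer: normalized term frequency.

    Stands in for a learned model that scores one query term against a document.
    """
    tokens = text.lower().split()
    return tokens.count(term) / len(tokens) if tokens else 0.0


def build_term_index(
    docs: Dict[str, str],
    vocab: Iterable[str],
    term_doc_score: Callable[[str, str], float],
) -> Dict[str, Dict[str, float]]:
    """Offline step: precompute a score for every (term, document) pair."""
    index: Dict[str, Dict[str, float]] = defaultdict(dict)
    for term in vocab:
        for doc_id, text in docs.items():
            score = term_doc_score(term, text)
            if score != 0.0:  # store only non-zero entries, like a posting list
                index[term][doc_id] = score
    return index


def rank(
    query_terms: List[str], index: Dict[str, Dict[str, float]]
) -> List[Tuple[str, float]]:
    """Query-time step: sum the precomputed per-term scores for each document."""
    totals: Dict[str, float] = defaultdict(float)
    for term in query_terms:
        for doc_id, score in index.get(term, {}).items():
            totals[doc_id] += score
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


docs = {
    "d1": "neural ranking with transformer kernel",
    "d2": "transformer transformer models",
    "d3": "cooking recipes",
}
index = build_term_index(docs, ["transformer", "ranking"], toy_score)
ranked = rank(["transformer", "ranking"], index)
```

Because each term is scored without seeing the rest of the query, the expensive scoring happens once per (term, document) pair at indexing time, and query evaluation reduces to lookups and additions over the whole collection.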

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Layer Normalization · Label Smoothing · Residual Connection · Byte Pair Encoding