VIRT: Improving Representation-based Models for Text Matching through Virtual Interaction

Dan Li; Yang Yang; Hongyin Tang; Jingang Wang; Tong Xu; Wei Wu; Enhong Chen

arXiv:2112.04195 · cs.CL · October 20, 2022 · 5 cites

Open Access

TL;DR

VIRT introduces a virtual interaction mechanism that distills interactive knowledge into Siamese transformer encoders, significantly enhancing text matching performance without increasing inference costs.

Contribution

The paper proposes VIRT, a method that transfers interaction knowledge from interaction-based models into Siamese encoders via attention map distillation, improving their effectiveness.

Findings

01. VIRT outperforms existing representation-based models on multiple datasets.

02. VIRT can be integrated into various models for further improvements.

03. No extra inference cost is introduced by VIRT.

Abstract

With the rise of pre-trained transformers, representation-based models built on Siamese transformer encoders have become mainstream techniques for efficient text matching. However, compared with interaction-based models, these models suffer from severe performance degradation because the two texts in a pair never interact. Prior work attempts to address this by performing extra interaction on the Siamese-encoded representations, while interaction during encoding is still ignored. To remedy this, we propose a Virtual InteRacTion mechanism (VIRT) to transfer interactive knowledge from interaction-based models into Siamese encoders through attention map distillation. As a train-time-only component, VIRT fully maintains the high efficiency of the Siamese structure and brings no extra computation cost during inference. To fully utilize the learned interactive…
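The attention map distillation idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`cross_attention_map`, `virtual_interaction_loss`), the single-head scaled dot-product form, the mean-squared-error objective, and the random stand-in tensors are all assumptions made for illustration. The student's two independently encoded texts are used to compute "virtual" cross-attention maps, which are then compared against cross-attention maps that would come from an interaction-based teacher.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention_map(queries, keys):
    """Scaled dot-product attention weights of `queries` over `keys`.

    queries: (len_q, dim), keys: (len_k, dim) -> weights: (len_q, len_k)
    """
    dim = queries.shape[-1]
    return softmax(queries @ keys.T / np.sqrt(dim), axis=-1)


def virtual_interaction_loss(student_x, student_y, teacher_x2y, teacher_y2x):
    """Hypothetical distillation loss (assumed MSE between attention maps).

    The student encodes X and Y separately; the cross-attention maps below
    are computed only at training time, so inference cost is unchanged.
    """
    student_x2y = cross_attention_map(student_x, student_y)  # (len_x, len_y)
    student_y2x = cross_attention_map(student_y, student_x)  # (len_y, len_x)
    loss_x2y = np.mean((student_x2y - teacher_x2y) ** 2)
    loss_y2x = np.mean((student_y2x - teacher_y2x) ** 2)
    return float(loss_x2y + loss_y2x)


# Random stand-ins for hidden states; real ones would come from encoders.
rng = np.random.default_rng(0)
len_x, len_y, dim = 4, 6, 8
student_x = rng.normal(size=(len_x, dim))
student_y = rng.normal(size=(len_y, dim))
teacher_x = rng.normal(size=(len_x, dim))
teacher_y = rng.normal(size=(len_y, dim))

# Stand-in teacher maps; a real teacher would attend over the joint pair.
teacher_x2y = cross_attention_map(teacher_x, teacher_y)
teacher_y2x = cross_attention_map(teacher_y, teacher_x)

loss = virtual_interaction_loss(student_x, student_y, teacher_x2y, teacher_y2x)
print(f"distillation loss: {loss:.4f}")
```

Because the cross-attention maps are only needed to compute the training loss, a sketch like this leaves the Siamese encoders free to encode each text independently at inference time, which is consistent with the abstract's claim of no extra inference cost.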

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques