VIRT: Improving Representation-based Models for Text Matching through Virtual Interaction
Dan Li, Yang Yang, Hongyin Tang, Jingang Wang, Tong Xu, Wei Wu, Enhong Chen

TL;DR
VIRT introduces a virtual interaction mechanism that distills interactive knowledge into Siamese transformer encoders, significantly enhancing text matching performance without increasing inference costs.
Contribution
The paper proposes VIRT, a method that transfers interaction knowledge from interaction-based models into Siamese encoders via attention map distillation, improving their text matching performance.
Findings
VIRT outperforms existing representation-based models on multiple datasets.
VIRT can be integrated into various models for further improvements.
No extra inference cost is introduced by VIRT.
Abstract
With the rise of pre-trained transformers, representation-based models built on Siamese transformer encoders have become mainstream techniques for efficient text matching. However, compared with interaction-based models, these models suffer from severe performance degradation due to the lack of interaction between the text pair. Prior work attempts to address this by performing extra interaction on the Siamese encoded representations, while interaction during encoding is still ignored. To remedy this, we propose a Virtual InteRacTion mechanism (VIRT) to transfer interactive knowledge from interaction-based models into Siamese encoders through attention map distillation. As a train-time-only component, VIRT fully preserves the high efficiency of the Siamese structure and adds no extra computation cost during inference. To fully utilize the learned interactive…
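To make the idea of attention map distillation more concrete, here is a minimal single-head sketch in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names, the squared-error distance, and the row renormalization of the teacher's cross-attention blocks are all hypothetical choices made for this example. The teacher is an interaction-based model that attends over the concatenated pair [X; Y]; the student is a Siamese encoder that never sees both texts at once, so its cross-text attention is computed "virtually" from the queries of one text and the keys of the other, and only during training.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_map(q, k):
    # Scaled dot-product attention weights for one head.
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d))

def renormalize_rows(block):
    # Assumption for this sketch: rescale a cross-attention block so that
    # each row sums to 1, making it comparable to the student's map.
    return block / block.sum(axis=-1, keepdims=True)

def virtual_interaction_loss(teacher_attn, q_x, k_x, q_y, k_y):
    """Hypothetical distillation loss between teacher and virtual attention.

    teacher_attn: (n_x + n_y, n_x + n_y) attention map of the
                  interaction-based model over the concatenated pair.
    q_x, k_x:     (n_x, d) student queries/keys for text X.
    q_y, k_y:     (n_y, d) student queries/keys for text Y.
    """
    n_x = q_x.shape[0]

    # Cross-text blocks of the teacher map: X attends to Y, Y attends to X.
    t_x2y = renormalize_rows(teacher_attn[:n_x, n_x:])
    t_y2x = renormalize_rows(teacher_attn[n_x:, :n_x])

    # "Virtual" cross-text attention built from the Siamese towers.
    # It is only needed for the training loss, never at inference time.
    s_x2y = attention_map(q_x, k_y)
    s_y2x = attention_map(q_y, k_x)

    # Squared-error distance (an assumed choice of distance).
    return np.mean((t_x2y - s_x2y) ** 2) + np.mean((t_y2x - s_y2x) ** 2)

# Toy usage with random projections (shapes are illustrative only).
rng = np.random.default_rng(0)
q_x, k_x = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
q_y, k_y = rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
teacher = attention_map(np.vstack([q_x, q_y]), np.vstack([k_x, k_y]))
loss = virtual_interaction_loss(teacher, q_x, k_x, q_y, k_y)
```

In this toy setup the student reuses the teacher's queries and keys, so the loss is zero up to floating-point error; with independently trained Siamese towers the loss would be positive and would be minimized alongside the task loss. Because the virtual attention appears only in the loss, dropping it at inference leaves the Siamese encoder's cost unchanged, which matches the abstract's claim of no extra inference cost.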
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
