Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation

Xuandong Zhao; Zhiguo Yu; Ming Wu; Lei Li

arXiv:2203.07687·cs.CL·March 16, 2022

TL;DR

This paper introduces Homomorphic Projective Distillation (HPD), which trains a small Transformer encoder with learnable projection layers to mimic a large pre-trained language model. The resulting sentence embeddings are compact yet retain quality on semantic retrieval and semantic textual similarity tasks.

Contribution

The paper proposes a novel distillation technique that enables small models to mimic large pre-trained models, producing high-quality, compressed sentence representations.

Findings

01 Achieves a 2.7-4.5 point gain on STS tasks over the previous best representations of the same size.

02 Speeds up retrieval by 8.2 times.

03 Reduces memory usage by 8 times.

Abstract

How to learn highly compact yet effective sentence representation? Pre-trained language models have been effective in many NLP tasks. However, these models are often huge and produce large sentence embeddings. Moreover, there is a big performance gap between large and small models. In this paper, we propose Homomorphic Projective Distillation (HPD) to learn compressed sentence embeddings. Our method augments a small Transformer encoder model with learnable projection layers to produce compact representations while mimicking a large pre-trained language model to retain the sentence representation quality. We evaluate our method with different model sizes on both semantic textual similarity (STS) and semantic retrieval (SR) tasks. Experiments show that our method achieves 2.7-4.5 points performance gain on STS tasks compared with previous best representations of the same size. In SR…
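The setup described in the abstract (a small encoder plus a learnable projection, trained to mimic a large model) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not state the training loss or how teacher embeddings are brought down to the compact size, so the sketch assumes PCA on the teacher side and a mean-squared-error objective. Random matrices stand in for both encoders, and all sizes and names (`D_TEACHER`, `D_STUDENT`, `D_OUT`, `pca_reduce`, `top_k`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: teacher width, student width, compact output width.
N, D_TEACHER, D_STUDENT, D_OUT = 256, 64, 32, 8

# Stand-ins for encoder outputs. A real setup would run a large and a small
# Transformer encoder over the same sentences.
teacher_emb = rng.normal(size=(N, D_TEACHER))
student_hidden = teacher_emb @ rng.normal(size=(D_TEACHER, D_STUDENT)) / np.sqrt(D_TEACHER)


def pca_reduce(x, k):
    """Project the rows of x onto their top-k principal components."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


# Assumption: the distillation target is the PCA-compressed teacher embedding.
target = pca_reduce(teacher_emb, D_OUT)

# Learnable projection layer on top of the (frozen, simulated) student encoder.
W = rng.normal(size=(D_STUDENT, D_OUT)) * 0.01
b = np.zeros(D_OUT)


def mse(pred, tgt):
    return float(np.mean((pred - tgt) ** 2))


initial_loss = mse(student_hidden @ W + b, target)

# Plain gradient descent on the MSE between student output and target.
lr = 0.05
for _ in range(500):
    pred = student_hidden @ W + b
    grad = 2.0 * (pred - target) / pred.size  # dLoss/dPred
    W -= lr * student_hidden.T @ grad
    b -= lr * grad.sum(axis=0)

final_loss = mse(student_hidden @ W + b, target)


def top_k(query, corpus, k=3):
    """Return indices of the k corpus rows most cosine-similar to query."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return np.argsort(-(c @ q))[:k]


# Semantic retrieval then runs over the compact D_OUT-dimensional embeddings.
compact = student_hidden @ W + b
hits = top_k(compact[0], compact)
```

In this toy version only the projection is trained; in a full distillation setup the small encoder's weights would presumably be updated as well. Retrieval cost and index size scale with the output width, which is why a compact `D_OUT` matters for the semantic retrieval setting.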

Code & Models

Repositories

xuandongzhao/hpd (PyTorch, official)

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Byte Pair Encoding · Softmax · Dense Connections · Residual Connection · Dropout · Layer Normalization