Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration

Shufan Wang; Laure Thompson; Mohit Iyyer

arXiv:2109.06304 · cs.CL · October 15, 2021 · 1 citation

TL;DR

This paper introduces Phrase-BERT, a fine-tuned BERT model that generates more meaningful phrase embeddings, improving phrase similarity tasks and enabling effective phrase-based topic modeling.

Contribution

We propose a contrastive fine-tuning method for BERT, trained on automatically generated phrasal paraphrases and on phrases in context, that produces phrase embeddings which better capture phrase-level semantics and compositionality.

Findings

01 Outperforms baseline models across a variety of phrase-level similarity tasks

02 Increases lexical diversity between nearest neighbors in the embedding space

03 Enables a phrase-based neural topic model when combined with a simple autoencoder

Abstract

Phrase representations derived from BERT often do not exhibit complex phrasal compositionality, as the model relies instead on lexical similarity to determine semantic relatedness. In this paper, we propose a contrastive fine-tuning objective that enables BERT to produce more powerful phrase embeddings. Our approach (Phrase-BERT) relies on a dataset of diverse phrasal paraphrases, which is automatically generated using a paraphrase generation model, as well as a large-scale dataset of phrases in context mined from the Books3 corpus. Phrase-BERT outperforms baselines across a variety of phrase-level similarity tasks, while also demonstrating increased lexical diversity between nearest neighbors in the vector space. Finally, as a case study, we show that Phrase-BERT embeddings can be easily integrated with a simple autoencoder to build a phrase-based neural topic model that interprets…
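To make the idea of a contrastive objective concrete, here is a minimal sketch of a triplet-style hinge loss over phrase vectors. It is an illustration under stated assumptions, not the authors' exact objective: the use of cosine distance, the margin of 0.5, and the toy 2-dimensional vectors are all hypothetical choices made for this example. The loss is zero once a phrase sits closer to its paraphrase than to a distractor by at least the margin.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss on cosine distance.

    Returns 0 when the positive is closer to the anchor than the
    negative by at least `margin` (an assumed hyperparameter).
    """
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy vectors standing in for phrase embeddings (hypothetical values).
anchor = [1.0, 0.0]
paraphrase = [0.9, 0.1]   # close to the anchor
distractor = [0.0, 1.0]   # orthogonal to the anchor

loss_good = triplet_loss(anchor, paraphrase, distractor)  # paraphrase as positive
loss_bad = triplet_loss(anchor, distractor, paraphrase)   # roles swapped
```

With the roles assigned correctly the loss vanishes, while swapping the paraphrase and the distractor yields a large penalty. During fine-tuning, gradients of such a penalty would pull paraphrases together and push distractors apart.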

Code & Models

Models

🤗 whaleloops/phrase-bert
model · 2.5k downloads · ♡ 22
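
The listed checkpoint can probably be loaded through the sentence-transformers library. The sketch below assumes that package is installed and that the model id whaleloops/phrase-bert is loadable via SentenceTransformer; the helper names embed_phrases and rank_by_cosine are hypothetical and introduced only for this example. The import is deferred so that the ranking helper works without the dependency.

```python
import math


def embed_phrases(phrases, model_name="whaleloops/phrase-bert"):
    """Encode phrases into vectors (downloads the model on first use).

    Assumes the sentence-transformers package is installed and that
    the checkpoint is compatible with it.
    """
    from sentence_transformers import SentenceTransformer  # lazy import

    model = SentenceTransformer(model_name)
    return model.encode(phrases)


def rank_by_cosine(query_vec, candidate_vecs):
    """Return candidate indices sorted from most to least similar."""

    def cosine(u, v):
        dot = sum(float(a) * float(b) for a, b in zip(u, v))
        norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
        norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
        return dot / (norm_u * norm_v)

    scores = [cosine(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

A typical use would be to call embed_phrases on a list of phrases, then pass the first vector as the query and the remaining vectors as candidates to rank_by_cosine to inspect nearest neighbors.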

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Weight Decay · Attention Dropout · Dropout · Layer Normalization · Softmax · Residual Connection