Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost

Lihu Chen; Gaël Varoquaux; Fabian M. Suchanek

arXiv:2203.07860·cs.CL·March 22, 2022

Open Access · 1 Repo

TL;DR

The paper introduces LOVE, a contrastive learning framework that enhances language models' robustness to out-of-vocabulary words by generating embeddings from surface forms, with minimal additional parameters.

Contribution

LOVE is a simple, lightweight contrastive learning framework that extends the word representations of an existing pre-trained model (such as BERT) to OOV words, with performance similar to or better than prior approaches.

Findings

01

LOVE performs on par with or better than prior methods, both on original datasets and on corrupted variants.

02

It significantly improves the robustness of BERT and FastText to OOV words.

03

The approach requires few additional parameters and is plug-and-play compatible.

Abstract

State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle when faced with Out-of-Vocabulary (OOV) words. To address this issue, we follow the principle of mimick-like models to generate vectors for unseen words, by learning the behavior of pre-trained embeddings using only the surface form of words. We present a simple contrastive learning framework, LOVE, which extends the word representation of an existing pre-trained language model (such as BERT), and makes it robust to OOV with few additional parameters. Extensive evaluations demonstrate that our lightweight model achieves similar or even better performances than prior competitors, both on original datasets and on corrupted variants. Moreover, it can be used in a plug-and-play fashion with FastText and BERT, where it significantly improves their robustness.
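
To make the idea in the abstract concrete, here is a minimal sketch of a contrastive objective that pulls a vector generated from a word's surface form toward that word's pre-trained embedding, while pushing it away from the embeddings of other words in the batch. This is an illustration under stated assumptions, not the authors' implementation: the function names (`char_ngrams`, `cosine`, `info_nce`), the use of character trigrams as surface-form features, the InfoNCE-style loss, and the temperature value are all hypothetical choices that the abstract does not specify.

```python
import math


def char_ngrams(word, n=3):
    """Split a word into overlapping character n-grams with boundary markers.

    Assumption: n-grams are just one possible surface-form featurization.
    """
    padded = f"<{word}>"
    if len(padded) <= n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(predicted, targets, temperature=0.1):
    """InfoNCE-style contrastive loss averaged over a batch.

    predicted[i] is the vector generated from the surface form of word i.
    targets[i] is the pre-trained embedding of the same word (the positive).
    Every other target in the batch serves as a negative.
    """
    total = 0.0
    for i, p in enumerate(predicted):
        logits = [cosine(p, t) / temperature for t in targets]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        total += log_denom - logits[i]  # negative log-softmax of the positive
    return total / len(predicted)


# Toy check: predictions aligned with their targets should score a lower loss
# than predictions matched to the wrong targets.
targets = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [[0.1, 0.9], [0.9, 0.1]]
print(char_ngrams("love"))
print(info_nce(aligned, targets) < info_nce(swapped, targets))
```

In a real mimick-like setup, a small trainable encoder would map the surface-form features to `predicted`, and only that encoder would be updated while the pre-trained embeddings stay fixed.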

Code & Models

Repositories

tigerchen52/love
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Contrastive Learning · Residual Connection · Attention Dropout · Weight Decay · Layer Normalization · Linear Warmup With Linear Decay