FRAGE: Frequency-Agnostic Word Representation

Chengyue Gong; Di He; Xu Tan; Tao Qin; Liwei Wang; Tie-Yan Liu

arXiv:1809.06858 · cs.CL · March 18, 2020 · 79 cites

TL;DR

This paper introduces FRAGE, a frequency-agnostic word embedding method using adversarial training, which improves the effectiveness of word representations, especially for rare words, across multiple NLP tasks.

Contribution

The paper proposes a novel adversarial training approach to learn frequency-agnostic word embeddings, addressing bias towards word frequency in existing embeddings.
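The adversarial idea can be illustrated with a small NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: a logistic discriminator tries to tell popular from rare words using only their embeddings, and the embeddings are then nudged in the direction that makes the discriminator worse. The function names, learning rates, and toy data are hypothetical, and the task loss that a real model would optimize alongside is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator_loss(E, is_popular, w, b):
    """Mean binary cross-entropy of a logistic 'popular vs. rare' classifier."""
    p = sigmoid(E @ w + b)
    eps = 1e-12
    return -np.mean(is_popular * np.log(p + eps)
                    + (1 - is_popular) * np.log(1 - p + eps))

def adversarial_round(E, is_popular, w, b, lr_d=0.5, lr_e=0.5, d_steps=20):
    """One round: fit the discriminator, then move embeddings to fool it."""
    # 1) Discriminator update: gradient descent on the cross-entropy.
    for _ in range(d_steps):
        err = sigmoid(E @ w + b) - is_popular      # d(loss_i)/d(logit_i)
        w = w - lr_d * (E.T @ err) / len(E)
        b = b - lr_d * err.mean()
    # 2) Embedding update: gradient ASCENT on each word's own loss term,
    #    so frequency becomes harder to read off the embedding.
    err = sigmoid(E @ w + b) - is_popular
    E_new = E + lr_e * np.outer(err, w)
    return E_new, w, b

# Synthetic toy data (assumption): the first 40 words are "popular" and
# their embeddings are shifted, which injects a frequency signal.
rng = np.random.default_rng(0)
V, d = 200, 8
is_popular = (np.arange(V) < 40).astype(float)
E = rng.normal(size=(V, d))
E[is_popular == 1] += 1.0

w, b = np.zeros(d), 0.0
E_new, w, b = adversarial_round(E, is_popular, w, b)
loss_before = discriminator_loss(E, is_popular, w, b)
loss_after = discriminator_loss(E_new, is_popular, w, b)
print(loss_before, loss_after)
```

In this sketch the discriminator's loss on the updated embeddings is higher than on the original ones for the same discriminator weights, meaning the frequency signal it had learned is weaker. A full training loop would alternate such rounds with updates on the task objective.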

Findings

01 FRAGE outperforms baselines in word similarity tasks

02 FRAGE improves performance in language modeling and machine translation

03 FRAGE enhances text classification results

Abstract

Continuous word representation (aka word embedding) is a basic building block in many neural network-based models used in natural language processing tasks. Although it is widely accepted that words with similar semantics should be close to each other in the embedding space, we find that word embeddings learned in several tasks are biased towards word frequency: the embeddings of high-frequency and low-frequency words lie in different subregions of the embedding space, and the embedding of a rare word and a popular word can be far from each other even if they are semantically similar. This makes learned word embeddings ineffective, especially for rare words, and consequently limits the performance of these neural network models. In this paper, we develop a neat, simple yet effective way to learn FRequency-AGnostic word Embedding (FRAGE) using adversarial training. We conducted…
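The claim that rare and popular words occupy different subregions can be checked on any embedding matrix. The sketch below shows one hypothetical diagnostic, not a procedure taken from the paper: for each rare word, it measures what fraction of its k nearest neighbors (by cosine similarity) are also rare. The function name, the value of k, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def rare_neighbor_fraction(E, is_rare, k=5):
    """Average share of rare words among the k nearest neighbors of rare words."""
    X = E / np.linalg.norm(E, axis=1, keepdims=True)   # unit-normalize rows
    sim = X @ X.T                                       # cosine similarities
    np.fill_diagonal(sim, -np.inf)                      # exclude the word itself
    nn = np.argsort(-sim, axis=1)[:, :k]                # indices of k nearest
    return is_rare[nn][is_rare].mean()

# Synthetic toy data (assumption): half the vocabulary is "rare".
rng = np.random.default_rng(0)
V, d = 200, 8
is_rare = np.arange(V) >= V // 2

# Embeddings with no frequency signal.
E_unbiased = rng.normal(size=(V, d))

# Embeddings where rare words are pushed into their own subregion.
E_biased = E_unbiased.copy()
E_biased[is_rare] += 3.0

unbiased = rare_neighbor_fraction(E_unbiased, is_rare)
biased = rare_neighbor_fraction(E_biased, is_rare)
print(unbiased, biased)
```

A value close to the overall share of rare words in the vocabulary suggests little frequency structure, while a value near 1 suggests that rare words cluster together regardless of meaning.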

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis