Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries

Mozhi Zhang; Yoshinari Fujinuma; Michael J. Paul; Jordan Boyd-Graber

arXiv:2005.00524·cs.CL·May 4, 2020·1 cites

Open Access

TL;DR

This paper demonstrates that retrofitting cross-lingual word embeddings to training dictionaries can improve downstream task performance, challenging the traditional focus on bilingual lexicon induction accuracy as the main evaluation metric.

Contribution

It introduces a simple retrofitting step that deliberately overfits cross-lingual word embeddings to the training dictionary. This often improves downstream task accuracy despite lowering BLI test accuracy, and it highlights the limitations of BLI as an evaluation metric.

Findings

01

Retrofitting to the training dictionary often improves accuracy on two downstream tasks.

02

Overfitting to the training dictionary can help generalization to downstream tasks that rely on dictionary words.

03

BLI accuracy may not reflect downstream performance.
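
For readers unfamiliar with the metric, the following is a minimal sketch of how a BLI score can be computed, assuming cosine nearest-neighbor retrieval and precision@1. The function name `bli_precision_at_1` and the index-pair dictionary format are hypothetical choices made for illustration; the paper's evaluation may use a different retrieval criterion.

```python
import numpy as np

def bli_precision_at_1(src_emb, tgt_emb, test_pairs):
    """Fraction of source words whose nearest target word is a gold translation.

    src_emb, tgt_emb: 2-D arrays of nonzero vectors in one shared space (assumed).
    test_pairs: iterable of (source_index, target_index) gold translations.
    """
    # Normalize rows so that a dot product equals cosine similarity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)

    # A source word may have several acceptable translations.
    gold = {}
    for s, t in test_pairs:
        gold.setdefault(s, set()).add(t)

    correct = 0
    for s, targets in gold.items():
        # Retrieve the single most similar target word.
        predicted = int(np.argmax(tgt @ src[s]))
        correct += predicted in targets
    return correct / len(gold)
```

Because this score only measures retrieval on held-out word pairs, it says nothing directly about how well the embeddings serve a task that mostly encounters words from the training dictionary.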

Abstract

Cross-lingual word embeddings (CLWE) are often evaluated on bilingual lexicon induction (BLI). Recent CLWE methods use linear projections, which underfit the training dictionary, to generalize on BLI. However, underfitting can hinder generalization to other downstream tasks that rely on words from the training dictionary. We address this limitation by retrofitting CLWE to the training dictionary, which pulls training translation pairs closer in the embedding space and overfits the training dictionary. This simple post-processing step often improves accuracy on two downstream tasks, despite lowering BLI test accuracy. We also retrofit to both the training dictionary and a synthetic dictionary induced from CLWE, which sometimes generalizes even better on downstream tasks. Our results confirm the importance of fully exploiting the training dictionary in downstream tasks and explain why BLI is…
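
The idea of pulling training translation pairs closer can be illustrated with a minimal sketch. It assumes the standard iterative-averaging update used in monolingual retrofitting, applied here to translation pairs; the function name `retrofit`, the weights `alpha` and `beta`, and the iteration count are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def retrofit(src_emb, tgt_emb, pairs, alpha=1.0, beta=1.0, n_iters=10):
    """Pull dictionary translation pairs together in a shared embedding space.

    src_emb, tgt_emb: 2-D arrays already mapped into one space (assumed).
    pairs: iterable of (source_index, target_index) training translations.
    alpha: weight keeping each vector near its original position (assumed).
    beta: weight pulling each vector toward its translations (assumed).
    """
    offset = src_emb.shape[0]
    original = np.vstack([src_emb, tgt_emb]).astype(float)
    updated = original.copy()

    # Build an undirected graph whose edges are dictionary entries.
    neighbors = {}
    for s, t in pairs:
        neighbors.setdefault(s, set()).add(t + offset)
        neighbors.setdefault(t + offset, set()).add(s)

    for _ in range(n_iters):
        for i, nbrs in neighbors.items():
            # Weighted average of the original vector and current neighbors.
            nbr_sum = updated[list(nbrs)].sum(axis=0)
            updated[i] = (alpha * original[i] + beta * nbr_sum) / (alpha + beta * len(nbrs))

    # Words outside the dictionary are returned unchanged.
    return updated[:offset], updated[offset:]
```

In this sketch only words that appear in the dictionary move, which is why the step fits the training pairs closely while leaving held-out BLI test words where the projection put them.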

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification