Offline bilingual word vectors, orthogonal transformations and the inverted softmax
Samuel L. Smith, David H. P. Turban, Steven Hamblin, Nils Y. Hammerla

TL;DR
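This paper proves that the linear transformation used to align two pre-trained ("offline") word vector spaces should be orthogonal, introduces an inverted softmax that raises precision @1 for English-to-Italian word translation from 34% to 43%, and shows that the orthogonal transformation is robust enough to noise to be learned from a pseudo-dictionary of identical character strings instead of an expert dictionary.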
Contribution
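It proves that the linear transformation between two embedding spaces should be orthogonal and obtains it with the singular value decomposition, introduces the inverted softmax for identifying translation pairs, and develops a noise-robust alignment method that learns from pseudo-dictionaries rather than expert bilingual signal.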
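The orthogonal alignment step can be illustrated with a minimal NumPy sketch. It assumes the dictionary pairs are stored as rows of two matrices `X` (source) and `Y` (target), and it uses the standard orthogonal Procrustes solution; the function names, matrix orientation, and toy data are illustrative assumptions, not the authors' code.

```python
import numpy as np

def normalize_rows(A):
    """Scale every row vector to unit length."""
    return A / np.linalg.norm(A, axis=1, keepdims=True)

def orthogonal_map(X, Y):
    """Return the orthogonal W minimising ||X @ W - Y||_F.

    X, Y: (n_pairs, dim) arrays whose i-th rows are a dictionary pair.
    The solution is W = U @ Vt, where U, Vt come from the SVD of X.T @ Y.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy check (synthetic data): Y is an exact rotation of X, so the
# recovered map should reproduce that rotation.
rng = np.random.default_rng(0)
dim = 5
X = normalize_rows(rng.standard_normal((50, dim)))
Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # random orthogonal matrix
Y = X @ Q
W = orthogonal_map(X, Y)
```

Because `W` is orthogonal, it preserves vector lengths and inner products, so cosine similarities within the source space are unchanged after mapping.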
Findings
Improved precision @1 from 34% to 43% with the inverted softmax when translating English words into Italian.
Achieved 40% precision using a pseudo-dictionary built from identical character strings, without expert bilingual signal.
Attained 68% precision when retrieving the translations of English sentences from a corpus of Italian sentences.
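The pseudo-dictionary behind the second finding pairs every character string that appears in both vocabularies with itself. A minimal sketch, assuming each vocabulary is a list of unique strings; the function name and the toy word lists are hypothetical.

```python
def pseudo_dictionary(source_vocab, target_vocab):
    """Return (source_index, target_index) pairs for strings shared by both vocabularies."""
    target_index = {word: i for i, word in enumerate(target_vocab)}
    return [
        (i, target_index[word])
        for i, word in enumerate(source_vocab)
        if word in target_index
    ]

# Hypothetical toy vocabularies: names, acronyms and numbers tend to be shared.
english = ["the", "london", "dna", "2015", "cat"]
italian = ["il", "gatto", "2015", "london", "dna"]
pairs = pseudo_dictionary(english, italian)
shared = [english[i] for i, _ in pairs]
```

The index pairs select the rows of the two embedding matrices that would serve as the (noisy) training dictionary for the alignment.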
Abstract
Usually bilingual word vectors are trained "online". Mikolov et al. showed they can also be found "offline", whereby two pre-trained embeddings are aligned with a linear transformation, using dictionaries compiled from expert knowledge. In this work, we prove that the linear transformation between two spaces should be orthogonal. This transformation can be obtained using the singular value decomposition. We introduce a novel "inverted softmax" for identifying translation pairs, with which we improve the precision @1 of Mikolov's original mapping from 34% to 43%, when translating a test set composed of both common and rare English words into Italian. Orthogonal transformations are more robust to noise, enabling us to learn the transformation without expert bilingual signal by constructing a "pseudo-dictionary" from the identical character strings which appear in both languages, achieving…
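The idea behind the inverted softmax can be sketched as follows. A standard softmax would normalise each source word's similarities over all target words; the inverted form instead normalises each target word's similarities over the source words, which penalises "hub" targets that are close to many source words. The matrix orientation (`sim[j, i]` for source `j`, target `i`), the inverse temperature `beta=10.0`, and the toy similarities are assumptions for illustration. Any per-source normalisation constant is omitted, since it would not change which target ranks first for a given source word.

```python
import numpy as np

def inverted_softmax(sim, beta=10.0):
    """Normalise exp(beta * sim) over source words (rows) for each target (column).

    sim: (n_source, n_target) similarity matrix.
    Returns an array of the same shape whose columns each sum to 1.
    """
    z = beta * sim
    z = z - z.max(axis=0, keepdims=True)  # subtract column max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

# Toy similarities: target 0 is a hub, equally close to both source words.
sim = np.array([[0.9, 0.8],
                [0.9, 0.1]])

nearest = sim.argmax(axis=1)                      # plain nearest neighbour
inverted = inverted_softmax(sim).argmax(axis=1)   # inverted-softmax retrieval
```

In this toy case nearest-neighbour retrieval sends both source words to the hub target 0, while the inverted softmax assigns source 0 to target 1 and source 1 to target 0.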
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
