AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes

Sascha Rothe; Hinrich Schütze

arXiv:1507.01127·cs.CL·August 10, 2022

TL;DR

AutoExtend is a flexible system that extends existing word embeddings to synsets and lexemes. The new embeddings live in the same vector space as the input word embeddings, and the system achieves state-of-the-art results on word similarity and word sense disambiguation tasks.

Contribution

It introduces a sparse tensor formalization that derives embeddings for synsets and lexemes from any existing word embeddings, without needing an additional training corpus.

Findings

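01. Achieves state-of-the-art performance on word similarity and word sense disambiguation tasks.

02. Extends word embeddings to synsets and lexemes that live in the same vector space as the input embeddings.

03. Uses WordNet as its lexical resource and can be applied to other resources such as Freebase.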

Abstract

We present AutoExtend, a system to learn embeddings for synsets and lexemes. It is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The synset/lexeme embeddings obtained live in the same vector space as the word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet as a lexical resource, but AutoExtend can be easily applied to other resources like Freebase. AutoExtend achieves state-of-the-art performance on word similarity and word sense disambiguation tasks.
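
To make the "same vector space" idea concrete, here is a minimal toy sketch in Python. It assumes the additive view in which a word vector is the sum of its lexeme vectors and a synset vector is the sum of the lexemes that belong to it. The uniform split of each word across its synsets is a simplifying assumption made only for illustration (the actual system learns how to distribute each word over its lexemes), and the function `extend`, the synset labels, and all numbers are hypothetical rather than taken from the paper or from WordNet.

```python
def extend(word_vecs, lexicon):
    """Toy extension of word vectors to lexeme and synset vectors.

    Assumption (illustration only): each word vector is split evenly
    across the synsets the word belongs to. Each share is a lexeme
    vector, and a synset vector is the sum of its lexeme vectors.
    """
    lexeme_vecs = {}
    synset_vecs = {}
    for word, synset_ids in lexicon.items():
        share = 1.0 / len(synset_ids)
        for sid in synset_ids:
            # A lexeme is one (word, synset) pair; only pairs listed in the
            # lexicon ever get a vector, which keeps the structure sparse.
            lex = [share * x for x in word_vecs[word]]
            lexeme_vecs[(word, sid)] = lex
            acc = synset_vecs.setdefault(sid, [0.0] * len(lex))
            for k, x in enumerate(lex):
                acc[k] += x
    return lexeme_vecs, synset_vecs


# Made-up 3-dimensional word embeddings.
word_vecs = {
    "suit": [2.0, 4.0, 6.0],
    "lawsuit": [0.0, 4.0, 8.0],
}

# Hypothetical lexicon: the synsets each word belongs to.
lexicon = {
    "suit": ["syn_clothing", "syn_legal"],
    "lawsuit": ["syn_legal"],
}

lexeme_vecs, synset_vecs = extend(word_vecs, lexicon)
print(synset_vecs["syn_clothing"])  # [1.0, 2.0, 3.0]
print(synset_vecs["syn_legal"])     # [1.0, 6.0, 11.0]
```

Because lexeme and synset vectors are built only by scaling and adding word vectors, they have the same dimensionality as the input and can be compared directly with the original word embeddings.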
