Learning Meta Word Embeddings by Unsupervised Weighted Concatenation of Source Embeddings

Danushka Bollegala

arXiv:2204.12386·cs.CL·April 27, 2022·1 cites

TL;DR

This paper proposes unsupervised methods for learning the concatenation weights used to build meta word embeddings, analyzes weighted concatenation theoretically, and reports that the methods outperform previous meta-embedding approaches on multiple benchmarks.

Contribution

It provides a theoretical framework for weighted concatenation as spectrum matching and proposes unsupervised methods to optimize these weights for improved meta-embeddings.

Findings

01

Weighted concatenation can be seen as spectrum matching between each source embedding and the meta-embedding

02

Proposed methods outperform previous meta-embedding techniques

03

Achieves better accuracy on multiple benchmark datasets

Abstract

Given multiple source word embeddings learnt using diverse algorithms and lexical resources, meta word embedding learning methods attempt to learn more accurate and wide-coverage word embeddings. Prior work on meta-embedding has repeatedly found simple vector concatenation of the source embeddings to be a competitive baseline. However, it remains unclear why and when simple vector concatenation can produce accurate meta-embeddings. We show that weighted concatenation can be seen as a spectrum matching operation between each source embedding and the meta-embedding, minimising the pairwise inner-product loss. Following this theoretical analysis, we propose two unsupervised methods to learn the optimal concatenation weights for creating meta-embeddings from a given set of source embeddings. Experimental results on multiple benchmark datasets show that the…
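As a rough illustration of the weighted concatenation the abstract refers to, the sketch below scales each source vector by a scalar weight and joins the results into one meta-embedding. It is not the paper's method: the weights are hand-picked rather than learnt, and the function names (`weighted_concat`, `dot`) and the toy vectors are assumptions made for this example. It also checks an elementary identity that shows why inner products are a natural quantity to compare: the inner product of two weighted concatenations equals the sum of the source inner products, each scaled by its squared weight.

```python
def weighted_concat(sources, weights):
    """Scale each source vector by its weight and join them into one vector.

    `sources` is a list of vectors (one per source embedding) for a single word;
    `weights` holds one scalar per source. Both names are illustrative.
    """
    if len(sources) != len(weights):
        raise ValueError("need exactly one weight per source embedding")
    meta = []
    for vec, w in zip(sources, weights):
        meta.extend(w * x for x in vec)
    return meta


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Toy data (made up): two source embeddings of different dimensionality
# for each of two words.
cat_sources = [[1.0, 0.0], [0.5, 0.5, 1.0]]
dog_sources = [[0.8, 0.2], [0.4, 0.6, 0.9]]

# Hand-picked weights; the paper learns these without supervision.
weights = [0.7, 0.3]

cat_meta = weighted_concat(cat_sources, weights)
dog_meta = weighted_concat(dog_sources, weights)

# Inner product in the meta-embedding space ...
lhs = dot(cat_meta, dog_meta)
# ... equals the squared-weight combination of the source inner products.
rhs = sum(w ** 2 * dot(a, b)
          for w, a, b in zip(weights, cat_sources, dog_sources))

print(len(cat_meta), abs(lhs - rhs) < 1e-12)
```

Setting every weight to 1 recovers plain (unweighted) concatenation, the baseline the abstract mentions.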

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies