Learning K-way D-dimensional Discrete Codes for Compact Embedding Representations

Ting Chen; Martin Renqiang Min; Yizhou Sun

arXiv:1806.09464 · cs.LG · June 26, 2018 · 41 cites

TL;DR

This paper introduces a compact K-way D-dimensional discrete encoding scheme for embeddings, significantly reducing parameter size while maintaining or improving performance across NLP and graph applications.

Contribution

It proposes a novel KD encoding method with a relaxed discrete optimization approach for end-to-end learning of meaningful codes, replacing traditional one-hot embeddings.
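The idea of relaxing a discrete code so it can be trained by gradient descent can be illustrated with a temperature softmax. This is a generic relaxation chosen for illustration, not necessarily the authors' formulation; the names `relaxed_embed`, `hard_embed`, `logits`, and `codebooks`, and the summation over dimensions, are assumptions. Each of the D code positions holds K logits; a softmax turns them into soft selection weights, and lowering the temperature pushes the soft embedding toward the one produced by the hard (argmax) code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relaxed_embed(logits, codebooks, temperature=1.0):
    """Soft embedding. logits: (N, D, K); codebooks: (D, K, d); returns (N, d)."""
    probs = softmax(logits / temperature, axis=-1)  # soft one-hot over K per dimension
    # Weighted average of code embeddings per dimension, summed over the D dimensions.
    return np.einsum("ndk,dke->ne", probs, codebooks)

def hard_embed(logits, codebooks):
    """Discrete embedding using the argmax code in each dimension."""
    codes = logits.argmax(axis=-1)  # (N, D), entries in [0, K)
    D = codes.shape[1]
    return codebooks[np.arange(D), codes].sum(axis=1)

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
N, D, K, d = 4, 3, 5, 8
logits = rng.normal(size=(N, D, K))
codebooks = rng.normal(size=(D, K, d))

soft = relaxed_embed(logits, codebooks, temperature=1.0)
nearly_hard = relaxed_embed(logits, codebooks, temperature=1e-6)
hard = hard_embed(logits, codebooks)
```

In an autodiff framework the soft path would let gradients reach `logits`, while the hard path gives the compact code stored after training.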

Findings

1. Embedding size reduced by up to 98%

2. Achieves comparable or better performance

3. Applicable across NLP and graph models
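To see where the size reduction can come from, the sketch below compares storage for a conventional embedding table with storage for KD codes plus shared code embeddings. It assumes one K-entry codebook per code dimension and no additional composition parameters; the function names and every number used are hypothetical, not figures from the paper.

```python
import math

def one_hot_bits(vocab_size, embed_dim, float_bits=32):
    # One dense vector per symbol.
    return vocab_size * embed_dim * float_bits

def kd_bits(vocab_size, K, D, embed_dim, float_bits=32):
    # Each symbol stores D code entries of ceil(log2(K)) bits each.
    code_bits = vocab_size * D * math.ceil(math.log2(K))
    # D shared codebooks, each holding K vectors of size embed_dim (assumed layout).
    codebook_bits = D * K * embed_dim * float_bits
    return code_bits + codebook_bits

# Hypothetical sizes for illustration only.
vocab_size, embed_dim, K, D = 10_000, 200, 32, 32
dense = one_hot_bits(vocab_size, embed_dim)
compact = kd_bits(vocab_size, K, D, embed_dim)
ratio = compact / dense
```

The dense table grows with `vocab_size * embed_dim`, while in the KD layout only the small per-symbol code grows with the vocabulary, so the saving widens as the vocabulary gets larger.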

Abstract

Conventional embedding methods directly associate each symbol with a continuous embedding vector, which is equivalent to applying a linear transformation based on a "one-hot" encoding of the discrete symbols. Despite its simplicity, such an approach yields a number of parameters that grows linearly with the vocabulary size and can lead to overfitting. In this work, we propose a much more compact K-way D-dimensional discrete encoding scheme to replace the "one-hot" encoding. In the proposed "KD encoding", each symbol is represented by a $D$-dimensional code with a cardinality of $K$, and the final symbol embedding vector is generated by composing the code embedding vectors. To learn semantically meaningful codes end-to-end, we derive a relaxed discrete optimization approach based on stochastic gradient descent, which can be generally applied to any differentiable computational graph with…
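A minimal sketch of the encoding described in the abstract: each symbol carries a code of D integers in [0, K), and its embedding is composed from the code embedding vectors those integers select. Summation is used here as one simple composition and is an assumption, as are the names `kd_embed`, `codes`, and `codebooks` and all sizes.

```python
import numpy as np

def kd_embed(codes, codebooks):
    """Compose symbol embeddings from KD codes by summation (assumed composition).

    codes:     int array, shape (N, D), entries in [0, K)
    codebooks: float array, shape (D, K, d), K code embeddings per code dimension
    returns:   float array, shape (N, d)
    """
    D = codes.shape[1]
    # picked[n, j] is codebooks[j, codes[n, j]]; resulting shape is (N, D, d).
    picked = codebooks[np.arange(D), codes]
    return picked.sum(axis=1)

# Hypothetical sizes for illustration only.
rng = np.random.default_rng(0)
N, K, D, d = 6, 4, 3, 5
codes = rng.integers(0, K, size=(N, D))   # one D-dimensional, K-way code per symbol
codebooks = rng.normal(size=(D, K, d))    # shared code embedding vectors
emb = kd_embed(codes, codebooks)          # one d-dimensional vector per symbol
```

Two symbols whose codes agree in most positions share most of their summed terms, which is one way such codes can carry semantic similarity.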

Code & Models

Repositories

chentingpc/kdcode-lm (TensorFlow)

Taxonomy

Topics: Advanced Graph Neural Networks · Complex Network Analysis Techniques · Machine Learning and ELM