The Lifted Matrix-Space Model for Semantic Composition

WooJin Chung, Sheng-Fu Wang, and Samuel R. Bowman

arXiv:1711.03602·cs.CL·April 8, 2019

TL;DR

The paper introduces the Lifted Matrix-Space model, a neural architecture for sentence encoding that maps word vectors to matrices with a single global transformation and composes them with an operation based on matrix-matrix multiplication. It outperforms TreeLSTM on multiple NLP benchmarks.

Contribution

The paper proposes a model that maps word embeddings to matrices through a shared global transformation and composes them by matrix multiplication, achieving better performance with fewer parameters than existing approaches.

Findings

1. Outperforms TreeLSTM on multiple NLP benchmarks

2. Uses fewer parameters while maintaining high performance

3. Transmits a larger number of activations across layers

Abstract

Tree-structured neural network architectures for sentence encoding draw inspiration from the approach to semantic composition generally seen in formal linguistics, and have shown empirical improvements over comparable sequence models by doing so. Moreover, adding multiplicative interaction terms to the composition functions in these models can yield significant further improvements. However, existing compositional approaches that adopt such a powerful composition function scale poorly, with parameter counts exploding as model dimension or vocabulary size grows. We introduce the Lifted Matrix-Space model, which uses a global transformation to map vector word embeddings to matrices, which can then be composed via an operation based on matrix-matrix multiplication. Its composition function effectively transmits a larger number of activations across layers with relatively few model…
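
The two steps the abstract names, a global lift from word vectors to matrices and composition by matrix-matrix multiplication, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`lift`, `compose`, `lift_param_count`), the toy sizes, the random weights, and the choice of `tanh` as the nonlinearity are all assumptions made for illustration, and the paper's actual composition function may include further terms.

```python
import math
import random


def matmul(a, b):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lift(vec, weights, bias, k):
    """Map a d-dimensional word vector to a k x k matrix.

    One shared (global) affine map is used for every word, so no
    per-word matrix is stored. The tanh squashing is an assumption.
    """
    flat = [
        math.tanh(sum(w * x for w, x in zip(row, vec)) + b)
        for row, b in zip(weights, bias)
    ]
    return [flat[i * k:(i + 1) * k] for i in range(k)]


def compose(left, right):
    """Combine two child matrices by matrix-matrix multiplication.

    The element-wise tanh applied afterwards is an assumption of this sketch.
    """
    return [[math.tanh(x) for x in row] for row in matmul(left, right)]


def lift_param_count(d, k):
    """Parameters in the shared lift: k*k*d weights plus k*k biases.

    Note that the count does not depend on vocabulary size.
    """
    return k * k * d + k * k


# Hypothetical toy setup: 4-dimensional embeddings lifted to 3 x 3 matrices.
rng = random.Random(0)
d, k = 4, 3
W = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(k * k)]
b = [0.0] * (k * k)
emb = {w: [rng.uniform(-1.0, 1.0) for _ in range(d)] for w in ("the", "cat", "sleeps")}

# Compose bottom-up over the binary parse ((the cat) sleeps).
noun_phrase = compose(lift(emb["the"], W, b, k), lift(emb["cat"], W, b, k))
sentence = compose(noun_phrase, lift(emb["sleeps"], W, b, k))
```

In this sketch every tree node carries a k x k matrix, so k squared activations pass between layers, while the only composition-related parameters are those of the shared lift. Because matrix multiplication is not commutative, swapping the left and right children generally changes the result, which lets the encoding depend on word order.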
