The Mechanism of Additive Composition

Ran Tian; Naoaki Okazaki; Kentaro Inui

arXiv:1511.08407 · cs.CL · April 5, 2017

TL;DR

This paper proves an upper bound on the bias of additive composition, written in terms of collocation strength, which the authors describe as the first theoretical analysis of compositional frameworks from a machine learning point of view. The theory also suggests ways to improve how additive composition approximates phrase meanings.

Contribution

The paper introduces a formal bound on the bias of additive composition, ties that bound to collocation strength and to empirically verified properties of natural language data, and draws implications for improving compositional models.

Findings

01. The bias bound is written in terms of collocation strength.

02. The more exclusively two successive words occur together, the more accurate their additive composition is guaranteed to be.

03. The paper proposes methods to enhance additive compositionality.

Abstract

Additive composition (Foltz et al., 1998; Landauer and Dumais, 1997; Mitchell and Lapata, 2010) is a widely used method for computing meanings of phrases, which takes the average of vector representations of the constituent words. In this article, we prove an upper bound for the bias of additive composition, which is the first theoretical analysis of compositional frameworks from a machine learning point of view. The bound is written in terms of collocation strength; we prove that the more exclusively two successive words tend to occur together, the more accurately their additive composition can be guaranteed to approximate the natural phrase vector. Our proof relies on properties of natural language data that are empirically verified, and can be theoretically derived from an assumption that the data is generated from a Hierarchical Pitman-Yor Process. The theory endorses additive…
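
As a minimal sketch of the definition in the abstract, additive composition can be written as an element-wise average of the constituent word vectors. The vectors below are made-up toy values, and `additive_composition` and `cosine` are hypothetical helper names used only for illustration; neither comes from the paper.

```python
import math


def additive_composition(u, v):
    """Average two equal-length word vectors element-wise."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return [(a + b) / 2.0 for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy inputs (assumptions, not data from the paper).
word_a = [1.0, 0.0, 2.0]
word_b = [3.0, 2.0, 0.0]
# Stand-in for a phrase vector observed directly for the two-word phrase.
phrase_observed = [2.0, 1.0, 1.5]

composed = additive_composition(word_a, word_b)
similarity = cosine(composed, phrase_observed)
```

Here `composed` is the averaged vector and `similarity` gives one rough way to see how close it lies to the stand-in phrase vector. Cosine similarity is an illustrative choice for this sketch and is not necessarily the measure of bias that the paper bounds.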
