Mathematical Foundations for a Compositional Distributional Model of Meaning
Bob Coecke, Mehrnoosh Sadrzadeh, Stephen Clark

TL;DR
This paper introduces a mathematical framework combining distributional semantics and grammatical composition using Pregroup algebra, enabling sentence meaning computation from word meanings within a unified vector space.
Contribution
It unifies distributional and compositional semantics through a categorical, diagrammatic approach based on Pregroup algebra, allowing sentence meanings to be derived systematically.
Findings
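Sentence meanings are computed within a single vector space, independent of the grammatical structure of the sentence.
Because all sentence meanings share that space, the inner product can compare arbitrary sentences, just as it compares word meanings in the distributional model.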
A Boolean-valued variant aligns with Montague semantics.
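The findings above can be illustrated with a minimal sketch, assuming toy two-dimensional noun and sentence spaces. The vectors, the verb tensors, and the function names (`transitive_sentence`, `intransitive_sentence`, `inner`) are hypothetical and chosen for illustration; they are not taken from the paper. A transitive verb is modelled as a tensor in N ⊗ S ⊗ N and an intransitive verb as a tensor in N ⊗ S, so both kinds of sentence end up as vectors in the same space S.

```python
def inner(u, v):
    """Inner product of two vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def transitive_sentence(subj, verb, obj):
    """Contract a verb tensor verb[i][k][j] in N (x) S (x) N with a
    subject vector (index i) and an object vector (index j).
    The result is a vector in the sentence space S (index k)."""
    s_dim = len(verb[0])
    return [
        sum(subj[i] * verb[i][k][j] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj)))
        for k in range(s_dim)
    ]


def intransitive_sentence(subj, verb):
    """Contract a verb tensor verb[i][k] in N (x) S with a subject vector.
    The result lives in the same sentence space S."""
    s_dim = len(verb[0])
    return [
        sum(subj[i] * verb[i][k] for i in range(len(subj)))
        for k in range(s_dim)
    ]


# Hypothetical toy data: noun space N and sentence space S are both 2-dimensional.
dogs = [1.0, 0.0]
cats = [0.0, 1.0]

# chase[i][k][j]: made-up coefficients for a transitive verb.
chase = [
    [[0.0, 1.0], [1.0, 0.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]

# sleep[i][k]: made-up coefficients for an intransitive verb.
sleep = [
    [1.0, 0.0],
    [0.0, 1.0],
]

dogs_chase_cats = transitive_sentence(dogs, chase, cats)
cats_chase_dogs = transitive_sentence(cats, chase, dogs)
dogs_sleep = intransitive_sentence(dogs, sleep)

# Sentences with different grammatical structure are compared directly.
similarity_a = inner(dogs_chase_cats, cats_chase_dogs)
similarity_b = inner(dogs_chase_cats, dogs_sleep)
```

With these made-up coefficients, swapping subject and object yields an orthogonal sentence vector, while the transitive and intransitive sentences can still be compared because both are vectors in S.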
Abstract
We propose a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types, for which we rely on the algebra of Pregroups, introduced by Lambek. This mathematical framework enables us to compute the meaning of a well-typed sentence from the meanings of its constituents. Concretely, the type reductions of Pregroups are 'lifted' to morphisms in a category, a procedure that transforms meanings of constituents into a meaning of the (well-typed) whole. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner-product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model. The mathematical structure we employ admits a purely diagrammatic…
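As a worked illustration of lifting a type reduction, consider a transitive sentence of the form subject, verb, object. The sketch below assumes the usual Pregroup typing (nouns of type n, transitive verbs of type n^r s n^l) and writes the verb in a chosen basis; the basis vectors n_i, s_k and the coefficients c_{ikj} are notation introduced here for illustration.

```latex
% Type reduction in the Pregroup: the noun types cancel with the verb's adjoints.
\[
  n \cdot (n^{r}\, s\, n^{l}) \cdot n
  \;=\; (n\, n^{r})\, s\, (n^{l}\, n)
  \;\le\; 1 \cdot s \cdot 1 \;=\; s
\]

% The reduction lifted to a linear map from the tensor of word spaces to S.
\[
  f \;=\; \epsilon_{N} \otimes 1_{S} \otimes \epsilon_{N}
  \;:\; N \otimes (N \otimes S \otimes N) \otimes N \longrightarrow S
\]

% Applied to word meanings, with the verb written as
% \overrightarrow{\mathit{verb}} = \sum_{ikj} c_{ikj}\, n_i \otimes s_k \otimes n_j :
\[
  f\bigl(\overrightarrow{\mathit{subj}} \otimes \overrightarrow{\mathit{verb}}
         \otimes \overrightarrow{\mathit{obj}}\bigr)
  \;=\; \sum_{ikj} c_{ikj}\,
        \langle \overrightarrow{\mathit{subj}} \mid n_i \rangle\;
        s_k\;
        \langle n_j \mid \overrightarrow{\mathit{obj}} \rangle
\]
```

The result is a vector in S whatever the grammatical types of the words were, which is what allows the inner product to compare sentences of different structure.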
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Rough Sets and Fuzzy Logic
