A Corpus-based Toy Model for DisCoCat
Stefano Gogioso (University of Oxford)

TL;DR
This paper constructs a concrete mapping from a toy model of syntax, defined for corpora annotated with constituent structure trees, to categorical semantics in a category of free R-semimodules, within the DisCoCat framework.
Contribution
It gives an explicit construction of the syntax-to-semantics mapping that the DisCoCat paradigm calls for, with semantics valued in free R-semimodules over an involutive commutative semiring R.
Findings
Constructed a syntax-to-semantics mapping in a categorical setting
Applied the model to a toy corpus with constituent structure trees
Showed that the resulting semantics can take place in a category of free R-semimodules over an involutive commutative semiring R
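To make "a corpus annotated with constituent structure trees" concrete, here is a minimal Python sketch of such a tree and two things one can read off it: the sentence it spans and the grammar productions it uses. The `Node` class, the `leaves` and `productions` helpers, and the example sentence are all hypothetical illustrations; they are not taken from the paper and do not reproduce its toy model of syntax.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Node:
    """A constituent: a syntactic label plus child constituents or words."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)


def leaves(tree: Union[Node, str]) -> List[str]:
    """Return the words spanned by the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words: List[str] = []
    for child in tree.children:
        words.extend(leaves(child))
    return words


def productions(tree: Node) -> List[Tuple[str, Tuple[str, ...]]]:
    """Return (parent label, child labels or words) pairs in pre-order."""
    rhs = tuple(c.label if isinstance(c, Node) else c for c in tree.children)
    result = [(tree.label, rhs)]
    for child in tree.children:
        if isinstance(child, Node):
            result.extend(productions(child))
    return result


# Hypothetical annotated sentence: [S [NP dogs] [VP [V chase] [NP cats]]]
example = Node("S", [
    Node("NP", ["dogs"]),
    Node("VP", [Node("V", ["chase"]), Node("NP", ["cats"])]),
])
```

Collecting `productions` over every tree in a corpus yields the rules that a syntax model for that corpus would have to account for. How the paper turns such annotations into its syntactic category is not described in this summary.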
Abstract
The categorical compositional distributional (DisCoCat) model of meaning rigorously connects distributional semantics and pregroup grammars, and has found a variety of applications in computational linguistics. From a more abstract standpoint, the DisCoCat paradigm predicates the construction of a mapping from syntax to categorical semantics. In this work we present a concrete construction of one such mapping, from a toy model of syntax for corpora annotated with constituent structure trees, to categorical semantics taking place in a category of free R-semimodules over an involutive commutative semiring R.
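The phrase "free R-semimodules over an involutive commutative semiring R" can be unpacked with a small sketch. Below, a semiring is a record of its operations together with an involution, and an element of the free semimodule on a set of basis labels is a finite dictionary from labels to coefficients. All names (`Semiring`, `BOOL`, `COMPLEX`, `vec_add`, `scale`, `inner`) are assumptions made for illustration, and the code does not reproduce the paper's construction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Generic, Hashable, TypeVar

S = TypeVar("S")


@dataclass(frozen=True)
class Semiring(Generic[S]):
    """A commutative semiring with an involution `conj` (assumed lawful)."""
    zero: S
    one: S
    add: Callable[[S, S], S]
    mul: Callable[[S, S], S]
    conj: Callable[[S], S]


# Booleans with (or, and) and the trivial involution.
BOOL = Semiring(False, True,
                lambda a, b: a or b,
                lambda a, b: a and b,
                lambda a: a)

# Complex numbers with complex conjugation as the involution.
COMPLEX = Semiring(0j, 1 + 0j,
                   lambda a, b: a + b,
                   lambda a, b: a * b,
                   lambda a: a.conjugate())

Vector = Dict[Hashable, S]  # basis label -> coefficient; absent means zero


def vec_add(R: Semiring, u: Vector, v: Vector) -> Vector:
    """Pointwise sum of two vectors."""
    out = dict(u)
    for key, coeff in v.items():
        out[key] = R.add(out.get(key, R.zero), coeff)
    return out


def scale(R: Semiring, r, v: Vector) -> Vector:
    """Multiply every coefficient by the scalar r."""
    return {key: R.mul(r, coeff) for key, coeff in v.items()}


def inner(R: Semiring, u: Vector, v: Vector):
    """Sum over shared basis labels of conj(u[k]) * v[k]."""
    total = R.zero
    for key, coeff in u.items():
        if key in v:
            total = R.add(total, R.mul(R.conj(coeff), v[key]))
    return total
```

With `BOOL`, vectors behave like subsets of the basis and `inner` tests whether two subsets overlap; with `COMPLEX`, the same functions recover ordinary finite-dimensional vectors with the usual sesquilinear inner product. Working over a general semiring lets one definition cover both cases.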
