Multi-Unit Directional Measures of Association: Moving Beyond Pairs of Words
Jonathan Dunn

TL;DR
This paper introduces multi-unit directional measures of association that extend pairwise measures to sequences of varying length. It addresses the segmentation problem this creates and shows that the measures are stable across eight languages and that each yields a unique ranking of sequences.
Contribution
The paper proposes a vector-based approach that uses 18 measures to quantify multi-unit association and to identify the borders of associated sequences across languages and types of representation.
Findings
Measures are stable across languages
Each measure provides a unique ranking of sequences
Generalizes association analysis beyond pairwise comparisons
Abstract
This paper formulates and evaluates a series of multi-unit measures of directional association, building on the pairwise ΔP measure, that are able to quantify association in sequences of varying length and type of representation. Multi-unit measures face an additional segmentation problem: once the implicit length constraint of pairwise measures is abandoned, association measures must also identify the borders of meaningful sequences. This paper takes a vector-based approach to the segmentation problem by using 18 unique measures to describe different aspects of multi-unit association. An examination of these measures across eight languages shows that they are stable across languages and that each provides a unique rank of associated sequences. Taken together, these measures expand corpus-based approaches to association by generalizing across varying lengths and types of…
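For orientation, the pairwise ΔP measure that the abstract builds on is commonly defined as P(outcome | cue) − P(outcome | cue absent), computed once in each direction for an adjacent pair. The sketch below illustrates only that standard pairwise definition; it is not the paper's multi-unit formulation or any of its 18 measures. The function names `delta_p` and `_ratio`, the bigram counting scheme, and the toy token list are assumptions made for illustration.

```python
from collections import Counter


def _ratio(numerator, denominator):
    # Hypothetical helper: treat an undefined probability as 0.0.
    return numerator / denominator if denominator else 0.0


def delta_p(tokens, x, y):
    """Return (left_to_right, right_to_left) Delta-P for the adjacent pair (x, y).

    Uses a 2x2 contingency table over adjacent bigrams:
      a = x followed by y          b = x followed by something else
      c = something else, then y   d = neither x first nor y second
    """
    counts = Counter(zip(tokens, tokens[1:]))
    total = sum(counts.values())
    a = counts[(x, y)]
    b = sum(n for (w1, w2), n in counts.items() if w1 == x and w2 != y)
    c = sum(n for (w1, w2), n in counts.items() if w1 != x and w2 == y)
    d = total - a - b - c
    # Left-to-right: P(y | x) - P(y | not x)
    left_to_right = _ratio(a, a + b) - _ratio(c, c + d)
    # Right-to-left: P(x | y) - P(x | not y)
    right_to_left = _ratio(a, a + c) - _ratio(b, b + d)
    return left_to_right, right_to_left


# Toy corpus, invented for illustration only.
toy_tokens = "of course of course of the of a course".split()
lr, rl = delta_p(toy_tokens, "of", "course")
print(f"LR = {lr:.3f}, RL = {rl:.3f}")
```

The two values generally differ, which is what makes the measure directional: one word may predict its neighbor better than the neighbor predicts it.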
Taxonomy
Topics: Categorization, perception, and language · Syntax, Semantics, Linguistic Variation · Language and cultural evolution
