The expected sum of edge lengths in planar linearizations of trees. Theory and applications
Lluís Alemany-Puig, Ramon Ferrer-i-Cancho

TL;DR
This paper analyzes the expected sum of edge lengths in planar linearizations of dependency trees, providing theoretical characterizations, efficient algorithms, and applications to linguistic data to better understand dependency distance minimization.
Contribution
It introduces a characterization of planarity, an efficient algorithm for computing the expected sum of edge lengths in planar arrangements, and explores the relationship between planarity constraints and dependency distances in language.
Findings
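The expected sum of edge lengths in planar arrangements is related to the expected sum in projective arrangements.
An algorithm computes the expected sum of edge lengths in planar arrangements in O(n) time.
The gap between actual dependency distances and the random baseline narrows as the formal constraint on dependency structures becomes stronger.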
Abstract
Dependency trees have proven to be a very successful model to represent the syntactic structure of sentences of human languages. In these structures, vertices are words and edges connect syntactically-dependent words. The tendency of these dependencies to be short has been demonstrated using random baselines for the sum of the lengths of the edges or its variants. A ubiquitous baseline is the expected sum in projective orderings (wherein edges do not cross and the root word of the sentence is not covered by any edge), that can be computed in time O(n). Here we focus on a weaker formal constraint, namely planarity. In the theoretical domain, we present a characterization of planarity that, given a sentence, yields either the number of planar permutations or an efficient algorithm to generate uniformly random planar permutations of the words. We also show the relationship between the…
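To make the definitions in the abstract concrete, here is a minimal Python sketch that checks planarity and projectivity of a single word order and averages the sum of edge lengths by brute force over all orders of a small tree. It is not the paper's O(n) algorithm: it enumerates all n! permutations and is only meant for sanity checks on tiny trees. The function names (`edge_length_sum`, `is_planar`, `is_projective`, `expected_sums`) and the representation of a tree as a list of vertex pairs are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def edge_length_sum(edges, pos):
    """Sum of |pos[u] - pos[v]| over all edges of the tree."""
    return sum(abs(pos[u] - pos[v]) for u, v in edges)


def is_planar(edges, pos):
    """True if no two edges cross when drawn above the word sequence."""
    spans = [tuple(sorted((pos[u], pos[v]))) for u, v in edges]
    for i, (a, b) in enumerate(spans):
        for c, d in spans[i + 1:]:
            # Two edges cross when exactly one endpoint of one edge
            # lies strictly inside the span of the other.
            if a < c < b < d or c < a < d < b:
                return False
    return True


def is_projective(edges, pos, root):
    """True if the order is planar and no edge covers the root."""
    r = pos[root]
    covered = any(min(pos[u], pos[v]) < r < max(pos[u], pos[v])
                  for u, v in edges)
    return not covered and is_planar(edges, pos)


def expected_sums(n, edges, root):
    """Exact average edge-length sum over all, planar, and projective orders.

    Brute force over n! permutations (illustrative only, tiny n).
    """
    totals = {"all": [0, 0], "planar": [0, 0], "projective": [0, 0]}
    for perm in permutations(range(n)):
        # perm[p] is the vertex placed at position p.
        pos = {v: p for p, v in enumerate(perm)}
        d = edge_length_sum(edges, pos)
        kinds = ["all"]
        if is_planar(edges, pos):
            kinds.append("planar")
            if is_projective(edges, pos, root):
                kinds.append("projective")
        for k in kinds:
            totals[k][0] += d
            totals[k][1] += 1
    return {k: Fraction(s, c) for k, (s, c) in totals.items()}


if __name__ == "__main__":
    # Path 0 - 1 - 2 rooted at vertex 0.
    print(expected_sums(3, [(0, 1), (1, 2)], root=0))
```

For the three-vertex path rooted at an endpoint, every order is planar, so the planar and unconstrained averages coincide at 8/3, while the projective average is 5/2 because the two orders that place the root in the middle are excluded.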
Taxonomy
Topics: Natural Language Processing Techniques · Linguistic Variation and Morphology · Genomics and Chromatin Dynamics
