Dependency distance minimization predicts compression
Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez

TL;DR
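This study tests whether dependency distance minimization predicts compression, that is, the minimization of word lengths. A recently introduced score confirms the prediction when word lengths are measured in phonemes but not when they are measured in syllables, and the results highlight limitations of traditional dependency distance scores.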
Contribution
It uses a recently introduced score to test the link between dependency distance minimization and word length compression, confirming the prediction when word lengths are measured in phonemes but not when they are measured in syllables.
Findings
Confirmation of the prediction when measuring in phonemes
Failure to confirm when measuring in syllables
Limitations of traditional dependency distance scores
Abstract
Dependency distance minimization (DDm) is a well-established principle of word order. It has been predicted theoretically that DDm implies compression, namely the minimization of word lengths. This is a second order prediction because it links a principle with another principle, rather than a principle and a manifestation as in a first order prediction. Here we test that second order prediction with a parallel collection of treebanks controlling for annotation style with Universal Dependencies and Surface-Syntactic Universal Dependencies. To test it, we use a recently introduced score that has many mathematical and statistical advantages with respect to the widely used sum of dependency distances. We find that the prediction is confirmed by the new score when word lengths are measured in phonemes, independently of the annotation style, but not when word lengths are measured in…
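For readers unfamiliar with the baseline mentioned in the abstract, the sum of dependency distances can be sketched as below. This is a minimal illustration, not the authors' code: the function name `sum_dependency_distances`, the head-index encoding (1-based head positions, 0 for the root), and the example sentence are all assumptions made here. The recently introduced score that the paper actually relies on is not defined in this excerpt, so it is not reproduced.

```python
def sum_dependency_distances(heads):
    """Return the sum of dependency distances of one sentence.

    heads[i] is the 1-based position of the head of the word at
    position i + 1; 0 marks the root, which has no incoming edge.
    (This encoding is an assumption chosen for the illustration.)
    """
    total = 0
    for position, head in enumerate(heads, start=1):
        if head == 0:
            continue  # the root contributes no dependency
        # The distance of a dependency is the gap, in words, between
        # the dependent and its head in the linear order.
        total += abs(position - head)
    return total


# Hypothetical example: "She saw the dog"
#   She -> head "saw" (2), saw -> root (0),
#   the -> head "dog" (4), dog -> head "saw" (2)
example_heads = [2, 0, 4, 2]
example_total = sum_dependency_distances(example_heads)
```

In this example the three dependencies have distances 1, 1, and 2, so `example_total` is 4. Because the raw sum grows with sentence length, comparing it across sentences of different sizes is not straightforward, which is one commonly cited reason for preferring a normalized score.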
Taxonomy
Topics: Natural Language Processing Techniques · Language and cultural evolution · Authorship Attribution and Profiling
Methods: Test
