Does Dependency Locality Predict Non-canonical Word Order in Hindi?
Sidharth Ranjan, Marten van Schijndel

TL;DR
This study investigates whether dependency length minimization influences non-canonical word order in Hindi, finding that discourse predictability and information status are more significant factors than dependency length alone.
Contribution
The paper introduces a classifier that assesses the influence of cognitive and discourse features on Hindi word order, highlighting the primacy of surprisal and givenness over dependency length.
Findings
Dependency length influences non-canonical order but is not the main predictor.
Surprisal and givenness are stronger determinants of word order preferences.
Human evaluations support the computational findings.
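For readers unfamiliar with the term, surprisal is the negative log-probability of a word given its preceding context. The sketch below estimates it with a toy add-one smoothed bigram model. The function name, the smoothing choice, and the toy corpus are illustrative assumptions, not the language models used in the paper.

```python
import math
from collections import Counter


def bigram_surprisal(corpus, sentence):
    """Per-word surprisal in bits under an add-one smoothed bigram model.

    Assumption: this toy model only illustrates the definition
    surprisal(w) = -log2 P(w | previous word); it is not the paper's model.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sent in corpus:
        tokens = ["<s>"] + sent  # "<s>" marks the sentence start
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1

    vocab_size = len(vocab)
    tokens = ["<s>"] + sentence
    return [
        -math.log2((bigram_counts[(prev, word)] + 1)
                   / (context_counts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    ]


# Hypothetical two-sentence corpus, for illustration only.
toy_corpus = [["a", "b"], ["a", "c"]]
scores = bigram_surprisal(toy_corpus, ["a", "b"])
print(scores)
```

A word that is more predictable from its context receives a lower score, which is the sense in which surprisal serves as a measure of discourse predictability.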
Abstract
Previous work has shown that isolated non-canonical sentences with Object-before-Subject (OSV) order are initially harder to process than their canonical counterparts with Subject-before-Object (SOV) order. Although this difficulty diminishes with appropriate discourse context, the underlying cognitive factors responsible for alleviating processing challenges in OSV sentences remain a question. In this work, we test the hypothesis that dependency length minimization is a significant predictor of non-canonical (OSV) syntactic choices, especially when controlling for information status such as givenness and surprisal measures. We extract sentences from the Hindi-Urdu Treebank corpus (HUTB) that contain clearly-defined subjects and objects, systematically permute the preverbal constituents of those sentences, and deploy a classifier to distinguish between original corpus sentences and…
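The permutation step and the dependency length measure described in the abstract can be illustrated with a minimal sketch. It assumes a verb-final clause in which every preverbal constituent attaches to the verb and each constituent's head is its last word; the function names and the `(label, n_words)` representation are hypothetical and do not reproduce the authors' pipeline or the HUTB annotation format.

```python
from itertools import permutations


def verb_dependency_length(constituents):
    """Sum the distances from each preverbal constituent's head to the final verb.

    constituents: list of (label, n_words) in surface order.
    Assumption: each head is the last word of its phrase, and distance is
    the number of intervening words plus one.
    """
    total = 0
    words_after = 0  # words between the current constituent's head and the verb
    for _, n_words in reversed(constituents):
        total += words_after + 1
        words_after += n_words
    return total


def preverbal_variants(constituents):
    """Return every reordering of the preverbal constituents except the original."""
    original = tuple(constituents)
    return [list(order) for order in permutations(constituents) if order != original]


# Toy clause: a one-word subject and a three-word object before the verb.
reference = [("subject", 1), ("object", 3)]
print(verb_dependency_length(reference))  # 5
for variant in preverbal_variants(reference):
    print(variant, verb_dependency_length(variant))  # object-first order gives 3
```

In this toy case the object-first order is the shorter one, because the long object no longer intervenes between the subject and the verb. A classifier of the kind the abstract describes could take the difference in such scores between the corpus sentence and each variant as one feature among several.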
Taxonomy
Topics: Natural Language Processing Techniques · Language and cultural evolution · Text Readability and Simplification
