For the Purpose of Curry: A UD Treebank for Ashokan Prakrit
Adam Farris, Aryaman Arora

TL;DR
This paper introduces the first annotated treebank of Ashokan Prakrit, built with the Universal Dependencies (UD) framework, to support linguistic analysis and the study of language change across diachronic stages of Indo-Aryan.
Contribution
It provides the first linguistically annotated treebank of Ashokan Prakrit, applying the UD formalism to this early Middle Indo-Aryan dialect continuum.
Findings
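Several linguistic features posed issues for annotation under UD
These include regnal names and other nominal compounds, "proto-ergative" participial constructions, and possible grammaticalizations evidenced by sandhi
The treebank lays groundwork for a complete annotation of all attested Ashokan texts and for computational study of language change in Indo-Aryan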
Abstract
We present the first linguistically annotated treebank of Ashokan Prakrit, an early Middle Indo-Aryan dialect continuum attested through Emperor Ashoka Maurya's 3rd century BCE rock and pillar edicts. For annotation, we used the multilingual Universal Dependencies (UD) formalism, following recent UD work on Sanskrit and other Indo-Aryan languages. We touch on some interesting linguistic features that posed issues in annotation: regnal names and other nominal compounds, "proto-ergative" participial constructions, and possible grammaticalizations evidenced by sandhi (phonological assimilation across morpheme boundaries). Eventually, we plan for a complete annotation of all attested Ashokan texts, towards the larger goals of improving UD coverage of different diachronic stages of Indo-Aryan and studying language change in Indo-Aryan using computational methods.
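UD treebanks are conventionally distributed in the tab-separated CoNLL-U format, with ten columns per token. The sketch below shows one way such a file could be read and its dependency root located. It assumes the treebank follows standard CoNLL-U; the sample sentence is a toy English example written for illustration, not data from the treebank, and the names `Token`, `parse_conllu`, and `root_of` are hypothetical helpers rather than part of any released tooling.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    id: int      # 1-based position in the sentence (CoNLL-U column 1)
    form: str    # surface word form (column 2)
    upos: str    # universal part-of-speech tag (column 4)
    head: int    # id of the syntactic head, 0 for the root (column 7)
    deprel: str  # dependency relation to the head (column 8)


# Toy example for illustration only; NOT taken from the Ashokan Prakrit treebank.
SAMPLE = """\
# text = The king speaks
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tking\tking\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tspeaks\tspeak\tVERB\t_\t_\t0\troot\t_\t_
"""


def parse_conllu(text: str) -> List[List[Token]]:
    """Parse CoNLL-U text into a list of sentences, each a list of Tokens."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped here.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if not cols[0].isdigit():
            # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
            continue
        current.append(Token(int(cols[0]), cols[1], cols[3], int(cols[6]), cols[7]))
    if current:
        sentences.append(current)
    return sentences


def root_of(sentence: List[Token]) -> Token:
    """Return the token whose head is 0, i.e. the root of the dependency tree."""
    return next(tok for tok in sentence if tok.head == 0)


sentences = parse_conllu(SAMPLE)
root = root_of(sentences[0])
print(root.form, root.upos, root.deprel)
```

The same loop could be pointed at a real `.conllu` file by reading it as UTF-8 text and passing the contents to `parse_conllu`. Features discussed in the abstract, such as nominal compounds, would appear in the `deprel` column under whichever relation labels the annotators chose.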
Taxonomy
Topics: Natural Language Processing Techniques · Linguistic Variation and Morphology · Syntax, Semantics, Linguistic Variation
