The Persian Dependency Treebank Made Universal
Mohammad Sadegh Rasooli, Pegah Safari, Amirsaeid Moloodi, Alireza Nourian

TL;DR
This paper presents an automatic method for converting the Persian Dependency Treebank to Universal Dependencies. The resulting dataset is larger and more diverse in vocabulary than the Uppsala Persian Universal Dependency Treebank, is more compatible with Universal Dependencies, and yields higher accuracy in delexicalized transfer parsing.
Contribution
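The authors developed an automatic method for converting the Persian Dependency Treebank (Rasooli et al., 2013) to Universal Dependencies, and compared the converted data against the Uppsala Persian Universal Dependency Treebank (Seraji et al., 2016) through parsing experiments and manual linguistic analysis.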
Findings
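Achieved a labeled attachment F-score of 85.2 in supervised parsing.
Produced a conversion of the Persian Dependency Treebank that is more compatible with Universal Dependencies than the Uppsala Persian Universal Dependency Treebank (Seraji et al., 2016), as well as larger and more diverse in vocabulary.
Demonstrated a roughly 2% absolute improvement in labeled attachment score in delexicalized Persian-to-English transfer parsing over a model trained on the data of Seraji et al. (2016).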
Abstract
We describe an automatic method for converting the Persian Dependency Treebank (Rasooli et al., 2013) to Universal Dependencies. This treebank contains 29,107 sentences. Our experiments, along with manual linguistic analysis, show that our data is more compatible with Universal Dependencies than the Uppsala Persian Universal Dependency Treebank (Seraji et al., 2016), and is larger in size and more diverse in vocabulary. Our data yields a labeled attachment F-score of 85.2 in supervised parsing. Our delexicalized Persian-to-English parser transfer experiments show that a parsing model trained on our data is about 2% (absolute) more accurate than one trained on the data of Seraji et al. (2016) in terms of labeled attachment score.
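For readers unfamiliar with the metric, the following is a minimal sketch of how a labeled attachment precision, recall, and F-score can be computed. It is not the authors' evaluation script. The function name `labeled_attachment_f1`, the `Arc` triple layout, and the toy sentence are assumptions made for illustration, and identifying tokens by index is a simplification of how evaluation tools align predicted and gold tokens.

```python
from typing import List, Tuple

# Hypothetical representation: (token index, head index, dependency label).
Arc = Tuple[int, int, str]


def labeled_attachment_f1(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over arcs whose head and label both match."""
    gold_set, pred_set = set(gold), set(pred)
    correct = len(gold_set & pred_set)  # arcs with the same token, head, and label
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy four-token sentence: the parser gets one label wrong (obl instead of obj).
gold = [(1, 2, "nsubj"), (2, 0, "root"), (3, 2, "obj"), (4, 2, "punct")]
pred = [(1, 2, "nsubj"), (2, 0, "root"), (3, 2, "obl"), (4, 2, "punct")]

precision, recall, f1 = labeled_attachment_f1(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

When the predicted and gold tokenizations are identical, precision and recall coincide and the F-score reduces to the plain labeled attachment score; the two diverge only when the parser's tokenization differs from the gold one.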
Taxonomy
Topics: Natural Language Processing Techniques · Algorithms and Data Compression · Semantic Web and Ontologies
