POS tagging, lemmatization and dependency parsing of West Frisian
Wilbert Heeringa, Gosse Bouma, Martha Hofman, Eduard Drenth, Jan Wijffels, Hans Van de Velde

TL;DR
This paper introduces an NLP pipeline for West Frisian that covers POS tagging, lemmatization, and dependency parsing. It is trained on a newly annotated corpus whose annotations were produced by applying Dutch tools to Dutch translations of the Frisian text.
Contribution
It presents the first integrated lemmatizer, POS-tagger, and dependency parser for West Frisian, utilizing a novel annotated corpus and translation-based annotation methods.
Findings
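Training with the parameter values used for the LassySmall UD 2.5 corpus, rather than default parameters, significantly improved lemmatization.
Annotating via Dutch translation worked; literal word-by-word translations produced with Oersetter gave the best POS tagging results.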
The tools are available as a web app and web service.
Abstract
We present a lemmatizer/POS-tagger/dependency parser for West Frisian using a corpus of 44,714 words in 3,126 sentences that were annotated according to the guidelines of Universal Dependencies version 2. POS tags were assigned to words by using a Dutch POS tagger that was applied either to a literal word-by-word translation or to sentences of a Dutch parallel text. The best results were obtained with literal translations created using the Frisian translation program Oersetter. Morphological and syntactic annotations were likewise generated on the basis of a literal Dutch translation. The performance of the lemmatizer/tagger/annotator trained with default parameters was compared to its performance when trained with the parameter values that were used for training on the LassySmall UD 2.5 corpus. A significant improvement was found for 'lemma'. The Frisian…
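The tag-projection idea in the abstract (tag a literal word-by-word Dutch translation, then copy each tag back to the Frisian word in the same position) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny `FRISIAN_TO_DUTCH` lexicon stands in for Oersetter, the `DUTCH_UPOS` lookup stands in for a real Dutch POS tagger, and the function names are hypothetical. It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy lexicon standing in for a literal word-by-word
# translation step (the paper uses Oersetter; this is NOT its output).
FRISIAN_TO_DUTCH: Dict[str, str] = {
    "de": "de",
    "hûn": "hond",
    "rint": "loopt",
}

# Hypothetical toy Dutch tagger: a plain lookup table of UPOS tags.
DUTCH_UPOS: Dict[str, str] = {
    "de": "DET",
    "hond": "NOUN",
    "loopt": "VERB",
}


def translate_literal(tokens: List[str]) -> List[str]:
    """Translate token by token; unknown words are passed through unchanged."""
    return [FRISIAN_TO_DUTCH.get(tok.lower(), tok) for tok in tokens]


def tag_dutch(tokens: List[str]) -> List[str]:
    """Assign a UPOS tag to each Dutch token; 'X' marks unknown words."""
    return [DUTCH_UPOS.get(tok.lower(), "X") for tok in tokens]


def project_tags(
    frisian_tokens: List[str],
    translate: Callable[[List[str]], List[str]] = translate_literal,
    tag: Callable[[List[str]], List[str]] = tag_dutch,
) -> List[Tuple[str, str]]:
    """Tag the Dutch translation and copy tags back by position.

    This only works because the translation is literal: token i of the
    Dutch sentence corresponds to token i of the Frisian sentence.
    """
    dutch_tokens = translate(frisian_tokens)
    if len(dutch_tokens) != len(frisian_tokens):
        raise ValueError("translation is not one-to-one; cannot project tags")
    return list(zip(frisian_tokens, tag(dutch_tokens)))


if __name__ == "__main__":
    for word, upos in project_tags(["De", "hûn", "rint"]):
        print(f"{word}\t{upos}")
```

The length check reflects why a literal translation is convenient here: with a free translation or a parallel text, the one-to-one positional alignment no longer holds and a separate word-alignment step would be needed.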
Taxonomy
Topics: Natural Language Processing Techniques · Linguistics and language evolution · Lexicography and Language Studies
