A Light Sliding-Window Part-of-Speech Tagger for the Apertium Free/Open-Source Machine Translation Platform
Gang Chen, Mikel L. Forcada

TL;DR
This paper presents a lightweight, free/open-source sliding-window part-of-speech tagger for the Apertium machine translation platform, proposes a new method for incorporating linguistic rules into it, and compares its performance across configurations and against Apertium's HMM tagger.
Contribution
It introduces a novel method for integrating linguistic rules into a sliding-window POS tagger within Apertium, enhancing tagging flexibility and performance.
Findings
The tagger's performance varies with the window setting and with whether Apertium-style "forbid" rules or Constraint Grammar are used.
Incorporating linguistic rules improves tagging accuracy.
Compared to the traditional HMM tagger in Apertium, the new tagger offers competitive results.
Abstract
This paper describes a free/open-source implementation of the light sliding-window (LSW) part-of-speech tagger for the Apertium free/open-source machine translation platform. First, the mechanism and training process of the tagger are reviewed, and a new method for incorporating linguistic rules is proposed. Second, experiments are conducted to compare the performance of the tagger under different window settings, with or without Apertium-style "forbid" rules, with or without Constraint Grammar, and also with respect to the traditional HMM tagger in Apertium.
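The abstract does not spell out the tagger's mechanism, so the following is only a minimal sketch of the general sliding-window idea, under stated assumptions: each word is represented by its ambiguity class (the set of tags it may take), the window is one word to the left and one to the right, and counts are estimated with a crude uniform split rather than the paper's training procedure. The function names, the `BOUNDARY` class, and the way "forbid" pairs are applied (only when the left neighbor is unambiguous) are all hypothetical and are not taken from the paper or from Apertium's code.

```python
from collections import defaultdict

# Hypothetical ambiguity class standing in for the sentence boundary.
BOUNDARY = frozenset({"<s>"})


def contexts(classes):
    """Yield (index, left_class, right_class) for a window of one word per side."""
    padded = [BOUNDARY] + list(classes) + [BOUNDARY]
    for i in range(1, len(padded) - 1):
        yield i - 1, padded[i - 1], padded[i + 1]


def estimate_counts(sentences):
    """Crude estimate (an assumption, not the paper's training): each word
    occurrence spreads one unit of count uniformly over the tags in its
    ambiguity class, keyed by the ambiguity classes of its neighbors."""
    counts = defaultdict(lambda: defaultdict(float))
    for classes in sentences:
        for i, left, right in contexts(classes):
            share = 1.0 / len(classes[i])
            for tag in classes[i]:
                counts[(left, right)][tag] += share
    return counts


def tag_sentence(classes, counts, forbidden=frozenset()):
    """Pick, for each word, the tag in its ambiguity class with the highest
    count for the surrounding window. `forbidden` is a set of
    (left_tag, tag) pairs, loosely modeled on "forbid" rules."""
    tags = []
    for i, left, right in contexts(classes):
        candidates = set(classes[i])
        # Assumed simplification: a forbid pair only applies when the
        # left neighbor has a single possible tag.
        if len(left) == 1:
            (left_tag,) = left
            allowed = {t for t in candidates if (left_tag, t) not in forbidden}
            candidates = allowed or candidates  # never leave a word untagged
        scores = counts.get((left, right), {})
        # sorted() makes tie-breaking deterministic.
        tags.append(max(sorted(candidates), key=lambda t: scores.get(t, 0.0)))
    return tags


if __name__ == "__main__":
    DET, N, V = frozenset({"DET"}), frozenset({"NOUN"}), frozenset({"VERB"})
    NV = frozenset({"NOUN", "VERB"})
    counts = estimate_counts([[DET, N, V], [DET, NV, V], [DET, N, V]])
    print(tag_sentence([DET, NV, V], counts))
```

Because the context is made of ambiguity classes rather than disambiguated tags, tagging needs only a table lookup per word, which is what keeps this kind of tagger light compared with decoding a full tag sequence.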
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
