Improving Tagging Performance by Using Voting Taggers
L. Marquez, L. Padro & H. Rodriguez (Universitat Politecnica de Catalunya)

TL;DR
This paper introduces a bootstrapping approach that leverages the agreement of two POS taggers to improve tagging accuracy and develop annotated corpora for resource-scarce languages, demonstrated on Spanish.
Contribution
It proposes a novel collaborative voting method for POS tagging that enhances accuracy and aids corpus creation in low-resource language settings.
Findings
Higher accuracy in agreed-upon tags
Effective corpus development for Spanish
Improved tagging performance through voting
Abstract
We present a bootstrapping method to develop an annotated corpus, which is especially useful for languages with few available resources. The method is being applied to develop a Spanish corpus of over 5 million words. The method consists of taking advantage of the collaboration of two different POS taggers. The cases in which both taggers agree show higher accuracy and are used to retrain the taggers.
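The agree-then-retrain loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two toy taggers (one keyed on the word form, one on the last two characters), the default tag, the token-level notion of agreement, the number of rounds, and all function names are assumptions chosen only to keep the example small and runnable.

```python
from collections import Counter, defaultdict


class FrequencyTagger:
    """Toy tagger: picks the most frequent training tag for a word feature."""

    def __init__(self, feature, default="NOUN"):
        self.feature = feature              # maps a word to a lookup key
        self.default = default              # tag assigned to unseen keys
        self.counts = defaultdict(Counter)  # key -> tag frequencies

    def train(self, tagged_tokens):
        for word, tag in tagged_tokens:
            self.counts[self.feature(word)][tag] += 1

    def tag(self, words):
        tags = []
        for word in words:
            key = self.feature(word)
            if key in self.counts:
                tags.append(self.counts[key].most_common(1)[0][0])
            else:
                tags.append(self.default)
        return tags


def make_taggers(training_tokens):
    """Train two different (hypothetical) taggers on the same data."""
    by_word = FrequencyTagger(lambda w: w.lower())
    by_suffix = FrequencyTagger(lambda w: w.lower()[-2:])
    by_word.train(training_tokens)
    by_suffix.train(training_tokens)
    return by_word, by_suffix


def agreed_tokens(tagger_a, tagger_b, words):
    """Keep only the tokens to which both taggers assign the same tag."""
    tags_a, tags_b = tagger_a.tag(words), tagger_b.tag(words)
    return [(w, ta) for w, ta, tb in zip(words, tags_a, tags_b) if ta == tb]


def bootstrap(seed_tokens, raw_sentences, rounds=2):
    """Retrain both taggers on the seed data plus the agreed-upon tokens."""
    tagger_a, tagger_b = make_taggers(seed_tokens)
    for _ in range(rounds):
        agreed = [tok for sent in raw_sentences
                  for tok in agreed_tokens(tagger_a, tagger_b, sent)]
        # Retrain from scratch so agreed tokens are not counted twice.
        tagger_a, tagger_b = make_taggers(seed_tokens + agreed)
    return tagger_a, tagger_b


# Tiny hand-tagged seed and untagged text (illustrative data only).
seed = [("el", "DET"), ("gato", "NOUN"), ("come", "VERB")]
raw = [["el", "pato", "come"], ["el", "gato", "tome"]]
tagger_a, tagger_b = bootstrap(seed, raw)
```

In this toy run the taggers disagree on "tome", so that token never enters the retraining data, while "pato", on which they agree, does. A real system would use stronger, context-sensitive taggers and would need to check how accurate the agreed-upon subset actually is before trusting it as training material.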
Taxonomy
Topics: Natural Language Processing Techniques · Algorithms and Data Compression · Speech and dialogue systems
