Cross-Domain Evaluation of POS Taggers: From Wall Street Journal to Fandom Wiki
Kia Kirstein Hansen, Rob van der Goot

TL;DR
This study evaluates the out-of-domain performance of POS taggers trained on Wall Street Journal data by testing them on Elder Scrolls Fandom wiki data, revealing significant drops in accuracy on unknown tokens and challenges with proper nouns.
Contribution
It provides the first analysis of cross-domain POS tagging performance on fine-grained labels using a new dataset from a gaming wiki, highlighting domain transfer challenges.
Findings
Accuracy on known tokens remains high across domains.
Accuracy on unknown tokens drops from 90.37% to 78.37% (Stanford) and from 87.84% to 80.41% (Bilty) on out-of-domain data.
Proper nouns and capitalization inconsistencies cause tagging difficulties.
Abstract
The Wall Street Journal section of the Penn Treebank has been the de facto standard for evaluating POS taggers for a long time, and accuracies over 97% have been reported. However, less is known about out-of-domain tagger performance, especially with fine-grained label sets. Using data from Elder Scrolls Fandom, a wiki about the Elder Scrolls video game universe, we create a modest dataset for qualitatively evaluating the cross-domain performance of two POS taggers: the Stanford tagger (Toutanova et al. 2003) and Bilty (Plank et al. 2016), both trained on WSJ. Our analyses show that performance on tokens seen during training is almost as good as in-domain performance, but accuracy on unknown tokens decreases from 90.37% to 78.37% (Stanford) and from 87.84% to 80.41% (Bilty) across domains. Both taggers struggle with proper nouns and inconsistent capitalization.
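The known/unknown split reported in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names (`build_vocab`, `split_accuracy`), the toy sentences, and the choice of case-sensitive matching on surface forms to decide whether a token is "known" are all assumptions made for illustration.

```python
from typing import Dict, Sequence, Set, Tuple

# A tagged sentence is a sequence of (token, tag) pairs.
TaggedSentence = Sequence[Tuple[str, str]]


def build_vocab(train: Sequence[TaggedSentence]) -> Set[str]:
    """Collect every surface form seen in training (case-sensitive: an assumption)."""
    return {token for sentence in train for token, _ in sentence}


def split_accuracy(
    vocab: Set[str],
    gold: Sequence[TaggedSentence],
    predicted: Sequence[Sequence[str]],
) -> Dict[str, float]:
    """Return tagging accuracy separately for known and unknown tokens."""
    counts = {"known": [0, 0], "unknown": [0, 0]}  # [correct, total]
    for gold_sentence, predicted_tags in zip(gold, predicted):
        if len(gold_sentence) != len(predicted_tags):
            raise ValueError("gold and predicted sentences differ in length")
        for (token, gold_tag), predicted_tag in zip(gold_sentence, predicted_tags):
            bucket = "known" if token in vocab else "unknown"
            counts[bucket][1] += 1
            counts[bucket][0] += int(gold_tag == predicted_tag)
    # An empty bucket has no defined accuracy, so report NaN for it.
    return {
        name: (correct / total if total else float("nan"))
        for name, (correct, total) in counts.items()
    }


# Toy data (hypothetical): one in-domain training sentence, one wiki-style test sentence.
train = [[("The", "DT"), ("market", "NN"), ("fell", "VBD")]]
gold = [[("The", "DT"), ("Dragonborn", "NNP"), ("fell", "VBD")]]
predicted = [["DT", "NN", "VBD"]]  # the proper noun is mis-tagged as a common noun

result = split_accuracy(build_vocab(train), gold, predicted)
print(result)
```

In the toy example the two tokens seen in training are tagged correctly, while the unseen proper noun is not, which mirrors the pattern the abstract describes. A different definition of "known" (for example, lowercased matching) would shift tokens between the two buckets, which matters for data with inconsistent capitalization.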
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Wikis in Education and Collaboration
