GWPT: A Green Word-Embedding-based POS Tagger
Chengwei Wei, Runqi Pang, C.-C. Jay Kuo

TL;DR
GWPT is a lightweight POS tagger that partitions word-embedding dimensions by frequency and represents each partition with different N-grams. It reaches high accuracy with fewer parameters and less computation, making it suitable for resource-constrained NLP applications.
Contribution
This work introduces GWPT, a green-learning-based POS tagger that partitions embedding dimension indices into low-, medium-, and high-frequency sets and represents each set with different N-grams, reducing model size and computational complexity while maintaining accuracy.
Findings
Achieves state-of-the-art accuracy among lightweight POS taggers.
Uses fewer model parameters and has significantly lower computational complexity in both training and inference.
Effective with both non-contextual and contextual embeddings.
Abstract
As a fundamental tool for natural language processing (NLP), the part-of-speech (POS) tagger assigns a POS label to each word in a sentence. A novel lightweight POS tagger based on word embeddings is proposed and named GWPT (green word-embedding-based POS tagger) in this work. Following the green learning (GL) methodology, GWPT contains three modules in cascade: 1) representation learning, 2) feature learning, and 3) decision learning. The main novelty of GWPT lies in representation learning. It uses non-contextual or contextual word embeddings, partitions embedding dimension indices into low-, medium-, and high-frequency sets, and represents them with different N-grams. Experimental results show that GWPT offers state-of-the-art accuracies with fewer model parameters and significantly lower computational complexity in both training and inference as compared with…
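The representation-learning step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the "frequency" of an embedding dimension is measured, so the sketch assumes a sign-change rate between adjacent words as a stand-in. The function names, the equal three-way split, and the window sizes per set are all hypothetical choices made for illustration.

```python
import random


def dimension_frequency(sentences, dim):
    """Assumed proxy for a dimension's 'frequency': the fraction of
    adjacent-word pairs in which the value of that dimension changes sign."""
    changes = [0] * dim
    pairs = 0
    for sent in sentences:
        for prev, cur in zip(sent, sent[1:]):
            pairs += 1
            for d in range(dim):
                if (prev[d] >= 0) != (cur[d] >= 0):
                    changes[d] += 1
    return [c / pairs if pairs else 0.0 for c in changes]


def partition_indices(freqs, n_low=None, n_high=None):
    """Rank dimension indices by frequency and split them into three sets.
    The default equal split is an assumption, not taken from the paper."""
    order = sorted(range(len(freqs)), key=lambda d: freqs[d])
    n_low = len(order) // 3 if n_low is None else n_low
    n_high = len(order) // 3 if n_high is None else n_high
    split = len(order) - n_high
    return {"low": order[:n_low], "medium": order[n_low:split], "high": order[split:]}


def ngram_features(sent, position, parts, windows):
    """For each frequency set, concatenate the values of its dimensions over
    a context window centred on `position`, zero-padded at sentence edges."""
    feats = []
    for name in ("low", "medium", "high"):
        half = windows[name]
        for offset in range(-half, half + 1):
            i = position + offset
            for d in parts[name]:
                feats.append(sent[i][d] if 0 <= i < len(sent) else 0.0)
    return feats


# Toy data: 4 sentences of 5 words, each word a 6-dimensional embedding.
random.seed(0)
DIM = 6
sentences = [[[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]
             for _ in range(4)]

freqs = dimension_frequency(sentences, DIM)
parts = partition_indices(freqs)

# Hypothetical half-window sizes: 1-gram, 3-gram, and 5-gram contexts.
windows = {"low": 0, "medium": 1, "high": 2}
x = ngram_features(sentences[0], 2, parts, windows)
print(parts, len(x))
```

With six dimensions split into three sets of two, the feature vector for one word has 2·1 + 2·3 + 2·5 = 18 entries, compared with 6·5 = 30 if every dimension used the widest window. This shows how assigning different N-grams to different sets can shrink the representation. Which set should receive the wider context is an assumption here and should be checked against the paper.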
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies
