Exploring transfer learning for Deep NLP systems on rarely annotated languages
Dipendra Yadav, Tobias Strauß, Kristina Yordanova

TL;DR
This paper explores transfer learning for POS tagging in low-resource languages, demonstrating that joint training of similar languages like Hindi and Nepali improves deep learning model performance despite limited annotated data.
Contribution
It introduces a transfer learning approach using joint training of similar languages and auxiliary tasks to enhance POS tagging accuracy in under-resourced languages.
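As a rough illustration of the two ideas named above, the following minimal sketch shows how two related corpora might be interleaved for joint training and how a main POS-tagging loss might be combined with an auxiliary-task loss. It is not the authors' implementation: the function names, the language codes, the choice of a weighted sum, and the `aux_weight` parameter are all assumptions made for illustration, and this summary does not specify the auxiliary tasks or the model architecture.

```python
import math
from itertools import zip_longest


def interleave_corpora(hindi_sentences, nepali_sentences):
    """Alternate sentences from two corpora so one model sees both languages.

    Each item is tagged with a language code ("hi" or "ne"); the codes are
    illustrative labels, not something prescribed by the paper.
    """
    merged = []
    for hindi, nepali in zip_longest(hindi_sentences, nepali_sentences):
        if hindi is not None:
            merged.append(("hi", hindi))
        if nepali is not None:
            merged.append(("ne", nepali))
    return merged


def cross_entropy(probs, gold_index):
    """Negative log-likelihood of the gold label under a predicted distribution."""
    return -math.log(probs[gold_index])


def multitask_loss(pos_probs, pos_gold, aux_probs, aux_gold, aux_weight=0.5):
    """Main POS loss plus a weighted auxiliary-task loss.

    aux_weight is a hypothetical hyperparameter; the default 0.5 is arbitrary.
    """
    main_loss = cross_entropy(pos_probs, pos_gold)
    aux_loss = cross_entropy(aux_probs, aux_gold)
    return main_loss + aux_weight * aux_loss


if __name__ == "__main__":
    batches = interleave_corpora(["hindi_1", "hindi_2"], ["nepali_1"])
    print(batches)
    print(multitask_loss([0.5, 0.5], 0, [0.25, 0.75], 1))
```

Setting `aux_weight` to 0 reduces the objective to plain single-task POS tagging, which makes it easy to compare the two regimes under otherwise identical conditions.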
Findings
Jointly trained Hindi-Nepali embeddings outperform monolingual embeddings.
Multitask learning with auxiliary tasks improves POS tagging accuracy.
Different training configurations affect model performance and robustness.
Abstract
Natural language processing (NLP) has experienced rapid advancements with the rise of deep learning, significantly outperforming traditional rule-based methods. By capturing hidden patterns and underlying structures within data, deep learning has improved performance across various NLP tasks, overcoming the limitations of rule-based systems. However, most research and development in NLP has been concentrated on a select few languages, primarily those with large numbers of speakers or financial significance, leaving many others underexplored. This lack of research is often attributed to the scarcity of adequately annotated datasets essential for training deep learning models. Despite this challenge, there is potential in leveraging the linguistic similarities between unexplored and well-studied languages, particularly those in close geographic and linguistic proximity. This thesis…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Dropout
