Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary

Meng Fang; Trevor Cohn

arXiv:1705.00424 · cs.CL · May 2, 2017 · 6 cites

TL;DR

This paper introduces a neural network model for tagging low-resource languages that relies on cross-lingual word embeddings trained from a bilingual dictionary rather than on parallel corpora, and it reports substantial empirical improvements over baseline techniques.

Contribution

It presents a novel neural model for joint training built on cross-lingual word embeddings, which are learned solely from a high-coverage bilingual dictionary and so remove the dependence on parallel corpora.

Findings

01

Substantial empirical improvements over baseline methods.

02

Several proposed active learning heuristics yield further improvements over competitive benchmark methods.

03

Effective use of bilingual dictionaries for cross-lingual transfer.

Abstract

Cross-lingual model transfer is a compelling and popular method for predicting annotations in a low-resource language, whereby parallel corpora provide a bridge to a high-resource language and its associated annotated corpora. However, parallel data is not readily available for many languages, limiting the applicability of these approaches. We address these drawbacks in our framework, which takes advantage of cross-lingual word embeddings trained solely on a high-coverage bilingual dictionary. We propose a novel neural network model for joint training from both sources of data based on cross-lingual word embeddings, and show substantial empirical improvements over baseline techniques. We also propose several active learning heuristics, which result in improvements over competitive benchmark methods.
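
To make the role of the dictionary concrete, the sketch below shows one very simple way a bilingual dictionary alone can place words of two languages in a shared vector space: each target-language word is put at the mean of the vectors of its listed translations. This is an illustrative stand-in and not the embedding procedure used in the paper, which the abstract does not specify. The function names (`project_with_dictionary`, `average`, `cosine`), the toy vectors, and the toy dictionary are all assumptions made up for this example.

```python
import math


def average(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def project_with_dictionary(source_vectors, dictionary):
    """Place each target word at the mean of its translations' source vectors.

    Hypothetical helper: target words whose translations all lack a source
    vector are skipped rather than given an arbitrary position.
    """
    shared = {}
    for target_word, translations in dictionary.items():
        known = [source_vectors[w] for w in translations if w in source_vectors]
        if known:
            shared[target_word] = average(known)
    return shared


# Toy, made-up data: three English vectors and a tiny Spanish-English dictionary.
english_vectors = {
    "dog": [0.9, 0.1, 0.0],
    "hound": [0.8, 0.2, 0.0],
    "run": [0.0, 0.1, 0.9],
}
toy_dictionary = {
    "perro": ["dog", "hound"],
    "correr": ["run"],
    "xyz": ["unlisted"],  # no translation with a known vector, so it is skipped
}

target_vectors = project_with_dictionary(english_vectors, toy_dictionary)
similarity_to_dog = cosine(target_vectors["perro"], english_vectors["dog"])
similarity_to_run = cosine(target_vectors["perro"], english_vectors["run"])
```

In this toy setup `perro` ends up much closer to `dog` than to `run`, which is the property model transfer depends on: a tagger that reads high-resource word vectors can, in principle, also read low-resource words that sit nearby in the same space.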

Code & Models

Repositories

mengf1/trpos (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis