Unsupervised Cross-Lingual Transfer of Structured Predictors without Source Data

Kemal Kurniawan; Lea Frermann; Philip Schulz; Trevor Cohn

arXiv:2110.03866 · cs.CL · October 11, 2021

TL;DR

This paper introduces an unsupervised cross-lingual transfer method for structured prediction that aggregates multiple input models without requiring source data, and that produces less noisy labels for distant supervision in tests on 18 languages.

Contribution

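The paper generalises unsupervised transfer methods for structured prediction to multiple input models, and shows that multiplying the marginal probabilities of substructures yields better high-probability structures for distant supervision than taking the union of such structures over the input models, as done in prior work.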

Findings

01. Effective transfer across 18 languages.

02. Multiplying marginal probabilities improves label quality.

03. Less noisy labels for distant supervision.

Abstract

Providing technologies to communities or domains where training data is scarce or protected, e.g. for privacy reasons, is becoming increasingly important. To that end, we generalise methods for unsupervised transfer from multiple input models for structured prediction. We show that the means of aggregating over the input models is critical, and that multiplying marginal probabilities of substructures to obtain high-probability structures for distant supervision is substantially better than taking the union of such structures over the input models, as done in prior work. Testing on 18 languages, we demonstrate that the method works in a cross-lingual setting, considering both dependency parsing and part-of-speech structured prediction problems. Our analyses show that the proposed method produces less noisy labels for the distant supervision.
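The contrast between the two aggregation strategies named in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a single token in a part-of-speech tagging task, three hypothetical input models that each give a marginal distribution over three tags, and a hypothetical cumulative-probability threshold of 0.8 for deciding which tags count as "high-probability". All function names and numbers are invented for illustration.

```python
from math import prod


def top_mass_set(dist, threshold):
    """Smallest set of labels whose cumulative probability reaches `threshold`."""
    chosen, total = set(), 0.0
    for label, p in sorted(dist.items(), key=lambda kv: kv[1], reverse=True):
        chosen.add(label)
        total += p
        if total >= threshold:
            break
    return chosen


def union_aggregate(marginals, threshold):
    """Take each model's high-probability labels separately, then union them."""
    out = set()
    for dist in marginals:
        out |= top_mass_set(dist, threshold)
    return out


def product_aggregate(marginals, threshold):
    """Multiply the marginals across models, renormalise, then threshold once."""
    labels = marginals[0].keys()
    scores = {label: prod(d[label] for d in marginals) for label in labels}
    z = sum(scores.values())
    normalised = {label: s / z for label, s in scores.items()}
    return top_mass_set(normalised, threshold)


# Hypothetical marginals for one token from three input models (assumed values).
marginals = [
    {"NOUN": 0.6, "VERB": 0.3, "ADJ": 0.1},
    {"NOUN": 0.5, "VERB": 0.1, "ADJ": 0.4},
    {"NOUN": 0.7, "VERB": 0.2, "ADJ": 0.1},
]

union_set = union_aggregate(marginals, threshold=0.8)
product_set = product_aggregate(marginals, threshold=0.8)
print("union:  ", sorted(union_set))
print("product:", sorted(product_set))
```

In this toy setting the union keeps every tag that any single model considers plausible, so the candidate set grows with each disagreeing model, while the product keeps only the tag that all models support. That is one intuition for why the product could give less noisy distant supervision, under the assumptions stated above.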

Code & Models

Repositories

kmkurn/uxtspwsd (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis