Co-training an Unsupervised Constituency Parser with Weak Supervision

Nickil Maveli; Shay B. Cohen

arXiv:2110.02283·cs.CL·March 22, 2022

TL;DR

This paper presents an unsupervised constituency parsing method that co-trains inside and outside span classifiers under weak supervision, reaching 63.1 F1 on the English PTB test set and new state-of-the-art results on Chinese and Japanese treebanks.

Contribution

It introduces a co-training approach with weak supervision and seed bootstrapping for unsupervised constituency parsing, improving accuracy and generalization.

Findings

01

Achieved 63.1 F1 on the English PTB test set.

02

Set new state-of-the-art results on Chinese and Japanese treebanks.

03

Demonstrated the effectiveness of weak supervision that uses prior knowledge of a language's branching direction (left- or right-branching) together with minimal heuristics.

Abstract

We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify whether a node dominates a specific span in a sentence. There are two types of classifiers: an inside classifier that acts on a span, and an outside classifier that acts on everything outside of a given span. Through self-training and co-training with the two classifiers, we show that the interplay between them helps improve the accuracy of both and, as a result, yields an effective parser. A seed bootstrapping technique prepares the data to train these classifiers. Our analyses further validate that such an approach, in conjunction with weak supervision using prior branching knowledge of a known language (left/right-branching) and minimal heuristics, injects strong inductive bias into the parser, achieving 63.1 F1 on the English (PTB) test set. In addition, we show the effectiveness of our…
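The split between the two classifiers can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `<span>` placeholder token, the 0.9 confidence threshold, and the scorer callables are all assumptions made for illustration. It only shows how a span yields an inside view and an outside view, and how, in a generic co-training round, each classifier's confident predictions could become pseudo-labels for the other.

```python
from typing import Callable, List, Sequence, Tuple

Span = Tuple[int, int]                      # half-open token offsets [start, end)
Example = Tuple[Sequence[str], Span]        # (sentence tokens, candidate span)
Labeled = Tuple[Sequence[str], Span, bool]  # example plus a pseudo-label
Scorer = Callable[[List[str]], float]       # hypothetical: returns P(span is a constituent)


def inside_view(tokens: Sequence[str], span: Span) -> List[str]:
    """Tokens covered by the span: what an inside classifier would see."""
    start, end = span
    return list(tokens[start:end])


def outside_view(tokens: Sequence[str], span: Span, mask: str = "<span>") -> List[str]:
    """Everything outside the span; the mask token is an assumed placeholder."""
    start, end = span
    return list(tokens[:start]) + [mask] + list(tokens[end:])


def co_training_round(
    unlabeled: Sequence[Example],
    score_inside: Scorer,
    score_outside: Scorer,
    threshold: float = 0.9,  # assumed confidence cut-off, not taken from the paper
) -> Tuple[List[Labeled], List[Labeled]]:
    """One generic co-training round: each view pseudo-labels data for the other."""
    new_for_inside: List[Labeled] = []
    new_for_outside: List[Labeled] = []
    for tokens, span in unlabeled:
        p_in = score_inside(inside_view(tokens, span))
        p_out = score_outside(outside_view(tokens, span))
        # Confident inside predictions become training data for the outside classifier.
        if p_in >= threshold or p_in <= 1 - threshold:
            new_for_outside.append((tokens, span, p_in >= 0.5))
        # Confident outside predictions become training data for the inside classifier.
        if p_out >= threshold or p_out <= 1 - threshold:
            new_for_inside.append((tokens, span, p_out >= 0.5))
    return new_for_inside, new_for_outside
```

In a full loop, both classifiers would be retrained on the enlarged label sets after each round and the round repeated; how the paper selects seeds, picks thresholds, and decodes trees from the scores is not covered by this sketch.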

Code & Models

Repositories

Nickil21/weakly-supervised-parsing (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Test