CPTAM: Constituency Parse Tree Aggregation Method

Adithya Kulkarni; Nasim Sabetpour; Alexey Markin; Oliver Eulenstein; Qi Li

arXiv:2201.07905·cs.CL·July 4, 2023

TL;DR

CPTAM introduces a novel truth discovery approach to aggregate constituency parse trees from multiple parsers, improving accuracy and reliability without needing ground truth, across various languages and domains.

Contribution

This paper presents the first truth discovery method for tree structures, specifically for constituency parse tree aggregation, using RF distance minimization.
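To make the RF distance concrete, here is a minimal Python sketch. It uses the common definition for rooted trees: the number of clusters (leaf sets under an internal node) that appear in exactly one of the two trees. Representing trees as nested tuples with token indices as leaves, and the names `clusters` and `rf_distance`, are illustrative assumptions and are not taken from the paper or its repository.

```python
def clusters(tree):
    """Collect the leaf set under every internal node of a nested-tuple tree.

    Assumption for this sketch: internal nodes are tuples and leaves are
    token indices (ints), so repeated words in a sentence stay distinct.
    """
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Size of the symmetric difference between the two cluster sets."""
    return len(clusters(tree_a) ^ clusters(tree_b))


# Two bracketings of the same four tokens.
balanced = ((0, 1), (2, 3))
right_branching = (0, (1, (2, 3)))
distance = rf_distance(balanced, right_branching)
```

The two example trees share the clusters {2, 3} and {0, 1, 2, 3}; each has one cluster the other lacks, so the distance is 2. Any normalization or handling of constituent labels that the paper applies is not covered by this sketch.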

Findings

1. CPTAM outperforms existing aggregation baselines on benchmark datasets.

2. The estimated weights effectively evaluate parser reliability without ground truth.

3. The method improves parse quality across multiple languages and domains.

Abstract

Diverse Natural Language Processing tasks employ constituency parsing to understand the syntactic structure of a sentence according to a phrase structure grammar. Many state-of-the-art constituency parsers are proposed, but they may provide different results for the same sentences, especially for corpora outside their training domains. This paper adopts the truth discovery idea to aggregate constituency parse trees from different parsers by estimating their reliability in the absence of ground truth. Our goal is to consistently obtain high-quality aggregated constituency parse trees. We formulate the constituency parse tree aggregation problem in two steps, structure aggregation and constituent label aggregation. Specifically, we propose the first truth discovery solution for tree structures by minimizing the weighted sum of Robinson-Foulds (RF) distances, a classic symmetric distance…
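The alternating idea behind truth discovery can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the aggregate for each sentence is restricted to one of the input parses (the paper's structure aggregation step may construct a different tree), and the weight update is a logarithmic rule borrowed from general truth discovery frameworks. The function names, the nested-tuple tree format, and the `rounds` and `eps` parameters are all hypothetical.

```python
import math


def clusters(tree):
    """Leaf sets under every internal node of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Number of clusters present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


def aggregate(parses, rounds=10, eps=1e-9):
    """Alternate between choosing aggregates and re-weighting parsers.

    parses[k][i] is parser k's tree for sentence i (assumed layout).
    """
    n_parsers, n_sents = len(parses), len(parses[0])
    weights = [1.0] * n_parsers
    truths = []
    for _ in range(rounds):
        # Step 1: per sentence, keep the candidate with the smallest
        # weighted sum of RF distances to all parsers' outputs.
        truths = []
        for i in range(n_sents):
            candidates = [parses[k][i] for k in range(n_parsers)]
            best = min(
                candidates,
                key=lambda c: sum(
                    w * rf_distance(c, t) for w, t in zip(weights, candidates)
                ),
            )
            truths.append(best)
        # Step 2: parsers that sit closer to the aggregates get larger
        # weights (assumed log-ratio update; eps avoids log(0)).
        totals = [
            sum(rf_distance(parses[k][i], truths[i]) for i in range(n_sents)) + eps
            for k in range(n_parsers)
        ]
        grand = sum(totals)
        weights = [-math.log(total / grand) for total in totals]
    return truths, weights


# Three hypothetical parsers on two sentences; the third disagrees on both.
a, b = ((0, 1), (2, 3)), (0, (1, (2, 3)))
c, d = ((0, 1), 2), (0, (1, 2))
truths, weights = aggregate([[a, c], [a, c], [b, d]])
```

In this toy run the two agreeing parsers determine the aggregates and end up with much larger weights than the dissenting one, which mirrors the claim that the estimated weights reflect parser reliability without ground truth.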

Code & Models

Repositories

kulkarniadithya/cptam (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies