Learning Constituent Headedness

Zeyao Qi; Yige Chen; KyungTae Lim; Haihua Pan; Jungyeul Park

arXiv:2603.14755 · cs.CL · March 17, 2026

Open Access

TL;DR

This paper treats constituent headedness as an explicit representational layer and learns to predict heads directly from aligned constituency and dependency data, substantially outperforming Collins-style rule-based percolation.

Contribution

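It presents a supervised approach that predicts constituent headedness as an explicit layer, with supervision induced by defining each constituent's head as the head of its dependency span. Predicted heads keep parsing accuracy comparable under head-driven binarization, make constituency-to-dependency conversion more faithful, and support cross-lingual transferability.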

Findings

01. Achieves near-ceiling intrinsic accuracy on aligned English and Chinese data.

02. Substantially outperforms Collins-style rule-based percolation.

03. Improves the fidelity of deterministic constituency-to-dependency conversion.
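
For readers unfamiliar with the baseline, the following is a minimal sketch of what Collins-style head percolation looks like in general. The rule table `HEAD_RULES` and the function `percolate_head` are hypothetical and heavily abbreviated for illustration; they are not the rule set evaluated in the paper, and the actual Collins table is considerably larger.

```python
from typing import Dict, List, Tuple

# Hypothetical toy rule table (an assumption for illustration only).
# parent label -> (scan direction, child labels in priority order)
HEAD_RULES: Dict[str, Tuple[str, List[str]]] = {
    "S": ("left", ["VP", "S"]),
    "VP": ("left", ["VBD", "VBZ", "VB", "VP"]),
    "NP": ("right", ["NN", "NNS", "NNP", "NP"]),
    "PP": ("left", ["IN", "TO"]),
}


def percolate_head(parent: str, children: List[str]) -> int:
    """Return the index of the head child chosen by the toy rules."""
    direction, priorities = HEAD_RULES.get(parent, ("left", []))
    order = list(range(len(children)))
    if direction == "right":
        order.reverse()  # scan children from the right edge
    # Try each priority label in turn, scanning in the given direction.
    for label in priorities:
        for i in order:
            if children[i] == label:
                return i
    # Fallback: first child in scan order when no rule matches.
    return order[0]


np_head = percolate_head("NP", ["DT", "JJ", "NN"])
s_head = percolate_head("S", ["NP", "VP", "."])
pp_head = percolate_head("PP", ["IN", "NP"])
unknown_head = percolate_head("X", ["A", "B"])
```

Rules of this kind are procedural and tied to a particular label inventory, which is the property a learned head predictor is meant to avoid.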

Abstract

Headedness is widely used as an organizing device in syntactic analysis, yet constituency treebanks rarely encode it explicitly and most processing pipelines recover it procedurally via percolation rules. We treat this notion of constituent headedness as an explicit representational layer and learn it as a supervised prediction task over aligned constituency and dependency annotations, inducing supervision by defining each constituent head as the dependency span head. On aligned English and Chinese data, the resulting models achieve near-ceiling intrinsic accuracy and substantially outperform Collins-style rule-based percolation. Predicted heads yield comparable parsing accuracy under head-driven binarization, consistent with the induced binary training targets being largely equivalent across head choices, while increasing the fidelity of deterministic constituency-to-dependency…
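
The supervision signal described in the abstract, defining each constituent head as the dependency span head, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `span_head`, the encoding of the root as `-1`, the half-open span convention, and the choice to return `None` when a span has no unique externally attached token are all assumptions made here.

```python
from typing import List, Optional


def span_head(heads: List[int], start: int, end: int) -> Optional[int]:
    """Return the token in [start, end) whose dependency head lies
    outside the span (the root is encoded as -1, an assumption).

    Returns None when the span does not have exactly one such token;
    how such cases are handled in the paper is not stated in the abstract.
    """
    external = [i for i in range(start, end)
                if not (start <= heads[i] < end)]
    return external[0] if len(external) == 1 else None


# Toy sentence: "the cat sat on the mat"
# indices:        0   1   2   3  4   5
# heads[i] is the index of token i's dependency head.
heads = [1, 2, -1, 5, 5, 2]

np_head = span_head(heads, 0, 2)    # "the cat"
pp_head = span_head(heads, 3, 6)    # "on the mat"
s_head = span_head(heads, 0, 6)     # whole sentence
no_head = span_head(heads, 1, 4)    # "cat sat on" is not a constituent
```

In this toy tree, "the cat" is headed by "cat", "on the mat" by "mat" (with "on" attached to the noun), and the full sentence by "sat". The span "cat sat on" has two tokens attached outside it, so no unique head is returned.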

Taxonomy

Topics: Natural Language Processing Techniques · Neurobiology of Language and Bilingualism · Topic Modeling