TL;DR
This paper identifies structural ambiguity and simplicity bias in unsupervised neural grammar induction, analyzing their origins and proposing a sentence-wise parse focusing method to improve accuracy and interpretability.
Contribution
It introduces a novel sentence-wise parse focusing approach that leverages pre-trained parsers to reduce ambiguity and bias in unsupervised neural grammar induction.
Findings
Significant performance improvements on unsupervised parsing benchmarks
Reduction in prediction variance and bias towards simple parses
Enhanced interpretability of learned grammars
Abstract
Neural parameterization has significantly advanced unsupervised grammar induction. However, training these models with a traditional likelihood loss over all possible parses exacerbates two issues: 1) structural ambiguity, which arbitrarily selects one among structurally ambiguous optimal grammars despite the specific preference of gold parses, and 2) simplicity bias, which leads a model to underutilize rules when composing parse trees. These challenges subject unsupervised neural grammar induction (UNGI) to inevitable prediction errors, high variance, and the need for extensive grammars to achieve accurate predictions. This paper tackles these issues, offering a comprehensive analysis of their origins. As a solution, we introduce sentence-wise parse focusing to reduce the parse pool per sentence for loss evaluation, using the…
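The contrast between a likelihood loss over all parses and a loss over a reduced parse pool can be illustrated with a toy calculation. The sketch below is not the paper's implementation: the function `nll`, the parse labels, and the scores are hypothetical values chosen for illustration, and it assumes the parses of a sentence can be enumerated explicitly (real models marginalize with dynamic programming). How the pool is chosen from pre-trained parsers is not shown here.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def nll(parse_log_scores, pool=None):
    """Negative log-likelihood of one sentence.

    parse_log_scores: dict mapping a parse label to log p(sentence, parse).
    pool: optional set of parse labels; if given, only these parses
          contribute to the loss (a reduced parse pool for this sentence).
    """
    if pool is None:
        selected = list(parse_log_scores.values())
    else:
        selected = [s for p, s in parse_log_scores.items() if p in pool]
    if not selected:
        raise ValueError("parse pool is empty for this sentence")
    return -log_sum_exp(selected)


# Hypothetical joint probabilities for three candidate parses of one sentence.
toy_scores = {
    "left_branching": math.log(0.02),
    "right_branching": math.log(0.01),
    "flat": math.log(0.005),
}

# Traditional loss: marginalize over every parse.
full_loss = nll(toy_scores)

# Focused loss: only the parses kept in this sentence's pool count.
focused_loss = nll(toy_scores, pool={"left_branching", "right_branching"})

print(f"full: {full_loss:.4f}  focused: {focused_loss:.4f}")
```

In this toy setting the focused loss is never smaller than the full loss, because probability mass on parses outside the pool no longer counts toward the likelihood; minimizing it therefore pushes mass onto the parses inside the pool.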
