TL;DR
This paper presents a globally normalized, transition-based feed-forward neural network that matches or outperforms recurrent models on part-of-speech tagging, dependency parsing, and sentence compression, and explains why global normalization is preferable to local normalization.
Contribution
It introduces a simple feed-forward neural network with global normalization for transition-based tasks, addressing label bias and achieving state-of-the-art results.
Findings
Achieves state-of-the-art results in part-of-speech tagging, dependency parsing, and sentence compression.
Globally normalized models can be strictly more expressive than locally normalized models, a consequence of the label bias problem.
Global models outperform locally normalized models due to reduced label bias.
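The distinction between the two normalization schemes can be written out as follows. This is a sketch in standard notation, not quoted from the summary above: the symbols $d_j$ (the $j$-th transition decision), $\rho$ (a score function with parameters $\theta$), $\mathcal{A}$ (the set of allowed decisions), and $\mathcal{D}_n$ (the set of all decision sequences of length $n$) are assumptions introduced for illustration.

```latex
% Local normalization: each decision is normalized given its history.
p_L(d_{1:n}) = \prod_{j=1}^{n}
  \frac{\exp \rho(d_{1:j-1}, d_j; \theta)}{Z_L(d_{1:j-1}; \theta)},
\qquad
Z_L(d_{1:j-1}; \theta) = \sum_{d' \in \mathcal{A}} \exp \rho(d_{1:j-1}, d'; \theta)

% Global normalization: one partition function over complete sequences.
p_G(d_{1:n}) =
  \frac{\exp \sum_{j=1}^{n} \rho(d_{1:j-1}, d_j; \theta)}{Z_G(\theta)},
\qquad
Z_G(\theta) = \sum_{d'_{1:n} \in \mathcal{D}_n}
  \exp \sum_{j=1}^{n} \rho(d'_{1:j-1}, d'_j; \theta)
```

In the local form every per-step distribution must sum to one, so the absolute magnitude of scores at a later step cannot change how much probability mass an earlier decision received. The global form has no such constraint, which is the intuition behind the label bias argument.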
Abstract
We introduce a globally normalized transition-based neural network model that achieves state-of-the-art part-of-speech tagging, dependency parsing and sentence compression results. Our model is a simple feed-forward neural network that operates on a task-specific transition system, yet achieves comparable or better accuracies than recurrent models. We discuss the importance of global as opposed to local normalization: a key insight is that the label bias problem implies that globally normalized models can be strictly more expressive than locally normalized models.
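To make the difference between global and local normalization concrete, here is a minimal toy sketch. It is not the paper's network or training procedure: the two-action system, the `SCORES` table, and all function names are hypothetical, and the global partition function is computed by brute-force enumeration, which only works at toy scale. Both schemes use the same scores; only the normalization differs.

```python
import itertools
import math

ACTIONS = ("A", "B")  # hypothetical two-action transition system
N_STEPS = 2           # length of every decision sequence in this toy

# Hypothetical scores keyed by (previous action, next action).
# These numbers are made up for illustration; they are not from the paper.
SCORES = {
    ("START", "A"): 1.0, ("START", "B"): 0.0,
    ("A", "A"): 0.1, ("A", "B"): 0.0,
    ("B", "A"): 3.0, ("B", "B"): 2.0,
}


def score(history, action):
    """Score of taking `action` after `history` (only the last action matters here)."""
    prev = history[-1] if history else "START"
    return SCORES[(prev, action)]


def total_score(seq):
    """Sum of per-step scores along a complete sequence."""
    return sum(score(seq[:j], a) for j, a in enumerate(seq))


def local_prob(seq):
    """Product of per-step softmaxes: each step is normalized on its own."""
    prob = 1.0
    for j, a in enumerate(seq):
        hist = seq[:j]
        z_local = sum(math.exp(score(hist, b)) for b in ACTIONS)
        prob *= math.exp(score(hist, a)) / z_local
    return prob


# Brute-force enumeration of every complete sequence (toy scale only).
ALL_SEQS = list(itertools.product(ACTIONS, repeat=N_STEPS))
Z_GLOBAL = sum(math.exp(total_score(s)) for s in ALL_SEQS)


def global_prob(seq):
    """Single softmax over complete sequences."""
    return math.exp(total_score(seq)) / Z_GLOBAL


best_local = max(ALL_SEQS, key=local_prob)
best_global = max(ALL_SEQS, key=global_prob)
print("local argmax: ", best_local)
print("global argmax:", best_global)
```

Under these made-up scores the two schemes disagree: the local model prefers `("A", "A")` because the first step commits most of its mass to `A` before the large scores that follow `B` are seen, while the global model ranks sequences by total score and prefers `("B", "A")`. Both still define valid probability distributions over the four sequences.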