TL;DR
This paper gives a reference description of Earley's parsing algorithm with known speed-ups and extends it to semiring weights, with improved worst-case runtime and implementation details suited to the large, weighted grammars used in natural language processing.
Contribution
It presents Earley's algorithm as a deduction system with various speed-ups, adds a partly novel variant that represents the grammar compactly as a single finite-state automaton, and generalizes the system to semiring-weighted parsing with practical implementation guidance.
Findings
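Achieves a worst-case runtime of O(N^3|G|), where N is the sentence length and |G| is the total length of the grammar's productions, matching CKY on a binarized grammar.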
Provides a semiring-weighted deduction framework for Earley's algorithm.
Ensures efficient implementation with asymptotic runtime comparable to unweighted methods.
Abstract
This paper provides a reference description, in the form of a deduction system, of Earley's (1970) context-free parsing algorithm with various speed-ups. Our presentation includes a known worst-case runtime improvement from Earley's O(N^3|G||R|), which is unworkable for the large grammars that arise in natural language processing, to O(N^3|G|), which matches the runtime of CKY on a binarized version of the grammar G. Here N is the length of the sentence, |R| is the number of productions in G, and |G| is the total length of those productions. We also provide a version that achieves runtime of O(N^3|M|) with |M| ≤ |G| when the grammar is represented compactly as a single finite-state automaton M (this is partly novel). We carefully treat the generalization to semiring-weighted deduction, preprocessing the grammar like Stolcke (1995) to eliminate deduction cycles,…
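To make the phrase "deduction system" concrete, here is a minimal sketch of a textbook Earley recognizer driven by an agenda, with the three classic rules (predict, scan, complete) applied to dotted items. It is not the paper's system: it is unweighted, includes none of the speed-ups or the finite-state grammar representation, and makes no attempt to reach the runtimes quoted above. The function name `earley_recognize`, the item layout `(i, j, lhs, rhs, dot)`, and the dictionary encoding of the grammar are all assumptions chosen for illustration.

```python
from collections import defaultdict


def earley_recognize(grammar, start, words):
    """Return True if `words` can be derived from `start`.

    grammar: dict mapping each nonterminal to a list of right-hand sides
             (tuples of symbols). A symbol is treated as a nonterminal
             exactly when it is a key of `grammar` (illustrative encoding).
    An item (i, j, lhs, rhs, dot) means rhs[:dot] derives words[i:j].
    """
    n = len(words)
    chart = set()   # every item proved so far
    agenda = []     # proved items whose consequences are not yet explored
    # waiting[(j, B)]: items ending at j whose dot sits just before B
    waiting = defaultdict(set)
    # done[(j, B)]: end positions of complete B constituents starting at j
    done = defaultdict(set)

    def add(item):
        if item not in chart:
            chart.add(item)
            agenda.append(item)

    # Axioms: every production of the start symbol, anchored at position 0.
    for rhs in grammar[start]:
        add((0, 0, start, rhs, 0))

    while agenda:
        i, j, lhs, rhs, dot = agenda.pop()
        if dot == len(rhs):
            # COMPLETE: a finished constituent advances every item waiting for it.
            done[(i, lhs)].add(j)
            for h, c_lhs, c_rhs, c_dot in list(waiting[(i, lhs)]):
                add((h, j, c_lhs, c_rhs, c_dot + 1))
        else:
            nxt = rhs[dot]
            if nxt in grammar:
                waiting[(j, nxt)].add((i, lhs, rhs, dot))
                # PREDICT: start every production of the awaited nonterminal at j.
                for p_rhs in grammar[nxt]:
                    add((j, j, nxt, p_rhs, 0))
                # COMPLETE against constituents that were finished earlier.
                for k in list(done[(j, nxt)]):
                    add((i, k, lhs, rhs, dot + 1))
            elif j < n and words[j] == nxt:
                # SCAN: consume one matching input word.
                add((i, j + 1, lhs, rhs, dot + 1))

    # Goal: some start production spans the whole input with its dot at the end.
    return any((0, n, start, rhs, len(rhs)) in chart for rhs in grammar[start])


# A small left-recursive toy grammar (hypothetical): S -> S + T | T, T -> a
toy_grammar = {"S": [("S", "+", "T"), ("T",)], "T": [("a",)]}
accepted = earley_recognize(toy_grammar, "S", ["a", "+", "a"])
rejected = earley_recognize(toy_grammar, "S", ["a", "+"])
```

Each branch of the loop corresponds to one inference rule, and the chart plays the role of the set of proved items. A semiring-weighted version would attach a weight to each item and combine weights when rules fire; that step is what requires the cycle-eliminating preprocessing the abstract mentions, so it is left out of this sketch.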
