Aspects of Pattern-Matching in Data-Oriented Parsing

Guy De Pauw

arXiv:cs/0008014 · cs.CL · May 23, 2007

TL;DR

This paper reinterprets Data-Oriented Parsing as a pattern-matching model focused on maximizing substructure size, which simplifies computation and maintains accuracy by enhancing context sensitivity.

Contribution

It introduces a pattern-matching perspective to DOP, eliminating the need for multiple derivations and enabling more efficient Viterbi-style parsing algorithms.

Findings

01. Pattern-matching approach retains parsing accuracy

02. Eliminates double work in probabilistic derivations

03. Enables efficient Viterbi-style optimization

Abstract

Data-Oriented Parsing (DOP) ranks among the best parsing schemes, pairing state-of-the-art parsing accuracy with the psycholinguistic insight that larger chunks of syntactic structure are relevant grammatical and probabilistic units. Parsing with the DOP model, however, seems to involve a lot of CPU cycles and a considerable amount of double work, brought on by the concept of multiple derivations, which is necessary for probabilistic processing but is not convincingly related to a proper linguistic backbone. It is, however, possible to reinterpret the DOP model as a pattern-matching model, which tries to maximize the size of the substructures that construct the parse, rather than the probability of the parse. By emphasizing this memory-based aspect of the DOP model, it is possible to do away with multiple derivations, opening up possibilities for efficient Viterbi-style…
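To make the "maximize substructure size" idea concrete, here is a minimal sketch of a flat analogue, not the paper's algorithm. It assumes a memory of previously seen word chunks in place of tree fragments, and it assumes that "largest substructures" can be approximated by "fewest chunks needed to cover the input". The function name `fewest_chunks` and the toy memory are hypothetical. The dynamic program keeps a single best analysis per position, in the Viterbi style the abstract alludes to, so no summing over multiple derivations is needed.

```python
from typing import List, Optional, Set, Tuple

Chunk = Tuple[str, ...]


def fewest_chunks(tokens: List[str], memory: Set[Chunk]) -> Optional[List[Chunk]]:
    """Cover `tokens` with as few chunks from `memory` as possible.

    Hypothetical flat stand-in for covering a parse tree with the
    largest stored fragments; returns None if no cover exists.
    """
    n = len(tokens)
    # best[i]: minimal number of chunks covering tokens[:i] (None = unreachable)
    best: List[Optional[int]] = [None] * (n + 1)
    # back[i]: start index of the last chunk in the best cover of tokens[:i]
    back: List[Optional[int]] = [None] * (n + 1)
    best[0] = 0

    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue
            if tuple(tokens[start:end]) in memory:
                candidate = best[start] + 1
                # Keep only the single best analysis per position.
                if best[end] is None or candidate < best[end]:
                    best[end] = candidate
                    back[end] = start

    if best[n] is None:
        return None

    # Follow back-pointers to recover the chosen chunks.
    chunks: List[Chunk] = []
    i = n
    while i > 0:
        j = back[i]
        chunks.append(tuple(tokens[j:i]))
        i = j
    chunks.reverse()
    return chunks


# Toy memory of previously seen chunks (illustrative data only).
memory = {("the",), ("cat",), ("sat",), ("the", "cat"), ("the", "cat", "sat")}
result = fewest_chunks(["the", "cat", "sat"], memory)
```

With this toy memory the whole input is itself a stored chunk, so the cover consists of that one chunk. The real model operates on tree fragments rather than word sequences, and the exact size criterion it optimizes is not specified in the truncated abstract above.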

Taxonomy

Topics: Natural Language Processing Techniques