Understanding Exhaustive Pattern Learning
Libin Shen

TL;DR
This paper provides the first theoretical justification for exhaustive pattern learning (EPL) in NLP, showing it approximates an ensemble of models with different segmentations, thus clarifying its practical success despite earlier criticisms.
Contribution
It formalizes EPL and demonstrates that its probability estimate approximates an ensemble method, offering a new theoretical understanding of EPL's effectiveness in NLP.
Findings
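The probability assigned by an EPL model is a constant-factor approximation of the probability assigned by an ensemble of models built from different segmentations of the training data.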
Provides the first theoretical justification for EPL in NLP.
Helps clarify the success of EPL despite previous criticisms.
Abstract
Pattern learning is an important problem in Natural Language Processing (NLP). Some exhaustive pattern learning (EPL) methods (Bod, 1992) were proved to be flawed (Johnson, 2002), while similar algorithms (Och and Ney, 2004) showed great advantages on other tasks, such as machine translation. In this article, we first formalize EPL, and then show that the probability given by an EPL model is a constant-factor approximation of the probability given by an ensemble method that integrates an exponential number of models obtained with various segmentations of the training data. This work provides, for the first time, a theoretical justification for the widely used EPL algorithm in NLP, which was previously viewed as a flawed heuristic method. A better understanding of EPL may lead to improved pattern learning algorithms in the future.
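To make the link between exhaustive patterns and segmentations concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the paper's formalization: patterns are taken to be contiguous token sub-sequences, and the function names (`segmentations`, `exhaustive_patterns`, `ensemble_patterns`) are hypothetical. It shows only that a sequence of n tokens has 2^(n-1) segmentations, and that the segments pooled across all of them are exactly the patterns an exhaustive extractor collects. It does not reproduce the paper's probability model or its constant-factor bound.

```python
from collections import Counter
from itertools import product


def segmentations(tokens):
    """Yield every split of `tokens` into contiguous, non-empty segments."""
    n = len(tokens)
    if n == 0:
        return
    # Each of the n-1 gaps between tokens is either a boundary (1) or not (0).
    for cuts in product([0, 1], repeat=n - 1):
        segs, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                segs.append(tuple(tokens[start:i]))
                start = i
        segs.append(tuple(tokens[start:]))
        yield segs


def exhaustive_patterns(tokens):
    """Count every contiguous sub-sequence (assumed stand-in for EPL extraction)."""
    n = len(tokens)
    return Counter(
        tuple(tokens[i:j]) for i in range(n) for j in range(i + 1, n + 1)
    )


def ensemble_patterns(tokens):
    """Pool the segments seen across all segmentations of `tokens`."""
    counts = Counter()
    for segs in segmentations(tokens):
        counts.update(segs)
    return counts


if __name__ == "__main__":
    sent = ["a", "b", "c"]
    print(len(list(segmentations(sent))), "segmentations")
    print("same pattern set:",
          set(exhaustive_patterns(sent)) == set(ensemble_patterns(sent)))
```

The pooled counts differ from the exhaustive counts (for example, a sentence-initial token appears in more segmentations than a sentence-internal one), which is one reason the relationship between the two estimates is an approximation rather than an identity.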
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
