Efficiently Inducing Features of Conditional Random Fields

Andrew McCallum

arXiv:1212.2504 · cs.LG · December 12, 2012 · 366 cites

TL;DR

This paper introduces a feature induction method for Conditional Random Fields that automatically selects relevant features, improving accuracy while reducing the feature count. The method applies to various CRF structures and is demonstrated on named entity extraction.

Contribution

It presents a novel feature induction approach for CRFs that enhances model performance and efficiency by selecting only significant feature conjunctions, adaptable to different CRF structures.

Findings

1. Improved accuracy over traditional methods
2. Significant reduction in feature count
3. Effective on the named entity extraction task

Abstract

Conditional Random Fields (CRFs) are undirected graphical models, a special case of which corresponds to conditionally-trained finite state machines. A key advantage of these models is their great flexibility to include a wide array of overlapping, multi-granularity, non-independent features of the input. In the face of this freedom, an important question remains: what features should be used? This paper presents a feature induction method for CRFs. Founded on the principle of constructing only those feature conjunctions that significantly increase log-likelihood, the approach is based on that of Della Pietra et al. [1997], but altered to work with conditional rather than joint probabilities, and with additional modifications for providing tractability specifically for a sequence model. In comparison with traditional approaches, automated feature induction offers both improved…
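
The principle named in the abstract, keeping only those feature conjunctions that significantly increase conditional log-likelihood, can be illustrated with a small sketch. This is not the paper's implementation: it assumes a binary logistic model in place of a sequence CRF, a Gaussian prior on the new weight, a one-dimensional Newton search, and a toy dataset. The names `gain`, `induce_conjunctions`, `sigma2`, `min_gain`, and the atomic features `cap`, `after_mr`, and `sent_start` are all hypothetical.

```python
import math
from itertools import combinations


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(scores, labels):
    # Conditional log-likelihood under p(y=1 | x) = sigmoid(score).
    total = 0.0
    for s, y in zip(scores, labels):
        p = sigmoid(s)
        total += math.log(p if y == 1 else 1.0 - p)
    return total


def gain(fires, scores, labels, sigma2=10.0, steps=25):
    # Best penalized log-likelihood improvement from adding ONE feature
    # with weight mu while all existing weights stay fixed.
    # sigma2 is an assumed Gaussian prior variance on mu.
    mu = 0.0
    for _ in range(steps):  # 1-D Newton ascent on a concave objective
        grad = -mu / sigma2
        hess = -1.0 / sigma2
        for f, s, y in zip(fires, scores, labels):
            if f:
                p = sigmoid(s + mu)
                grad += y - p
                hess -= p * (1.0 - p)
        mu -= grad / hess
    new_scores = [s + mu * f for f, s in zip(fires, scores)]
    improvement = log_likelihood(new_scores, labels) - log_likelihood(scores, labels)
    return improvement - mu * mu / (2.0 * sigma2), mu


def induce_conjunctions(examples, scores, min_gain=1.0):
    # Propose every pairwise conjunction of atomic features and keep
    # those whose gain exceeds the (assumed) threshold min_gain.
    labels = [y for _, y in examples]
    atoms = sorted(set().union(*(feats for feats, _ in examples)))
    selected = []
    for a, b in combinations(atoms, 2):
        fires = [1 if a in feats and b in feats else 0 for feats, _ in examples]
        g, mu = gain(fires, scores, labels)
        if g > min_gain:
            selected.append(((a, b), g, mu))
    return sorted(selected, key=lambda item: -item[1])


# Toy data: (atomic features of a token, 1 if the token is a person name).
examples = [
    ({"cap", "after_mr"}, 1),
    ({"cap", "after_mr"}, 1),
    ({"cap", "after_mr"}, 1),
    ({"cap", "sent_start"}, 0),
    ({"cap", "sent_start"}, 0),
    ({"after_mr"}, 0),
    ({"sent_start"}, 0),
]
scores = [0.0] * len(examples)  # current model: no features yet

selected = induce_conjunctions(examples, scores, min_gain=1.0)
for pair, g, mu in selected:
    print(pair, round(g, 3), round(mu, 3))
```

In this sketch each candidate is scored in isolation with the rest of the model frozen, so ranking candidates needs only a one-dimensional optimization per conjunction rather than a full retraining. A conjunction that never fires receives zero gain and is filtered out by the threshold.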

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management