Text Chunking using Transformation-Based Learning

Lance A. Ramshaw (Bowdoin College); Mitchell P. Marcus (University of Pennsylvania)

arXiv:cmp-lg/9505040 · cmp-lg · September 25, 2009 · 474 cites

TL;DR

This paper applies transformation-based learning to text chunking, reaching roughly 92% recall and precision on base noun phrase (baseNP) chunks and roughly 88% on somewhat more complex chunks that partition the sentence, which shows that the method is effective beyond part-of-speech tagging.

Contribution

It applies transformation-based learning to text chunking by encoding chunk structure as new tags attached to each word, which turns chunking into a tagging problem.

Findings

01. Roughly 92% recall and precision for baseNP chunks
02. Roughly 88% recall and precision for somewhat more complex chunks that partition the sentence
03. Effective adaptation of transformation-based learning to chunking

Abstract

Eric Brill introduced transformation-based learning and showed that it can do part-of-speech tagging with fairly high accuracy. The same method can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive "baseNP" chunks. For this purpose, it is convenient to view chunking as a tagging problem by encoding the chunk structure in new tags attached to each word. In automatic tests using Treebank-derived data, this technique achieved recall and precision rates of roughly 92% for baseNP chunks and 88% for somewhat more complex chunks that partition the sentence. Some interesting adaptations to the transformation-based learning approach are also suggested by this application.
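To make the "chunking as tagging" idea concrete, here is a minimal Python sketch of one way to encode chunk spans as one tag per word, decode them again, and score a prediction by chunk-level precision and recall. The abstract does not name the tag set or the scoring rule, so both are assumptions made for illustration: the tags used are I (inside a chunk), O (outside), and B (first word of a chunk that directly follows another chunk), and a predicted chunk counts as correct only if it matches a gold chunk exactly. The function names and the example sentence are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) word indices, end exclusive


def spans_to_tags(n_words: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping chunk spans as one tag per word (assumed I/O/B scheme)."""
    tags = ["O"] * n_words
    prev_end = None
    for start, end in sorted(spans):
        for i in range(start, end):
            tags[i] = "I"
        # "B" is needed only where a chunk starts right after another chunk ends.
        if prev_end == start:
            tags[start] = "B"
        prev_end = end
    return tags


def tags_to_spans(tags: List[str]) -> List[Span]:
    """Decode per-word tags back into chunk spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        # "O" and "B" both close any chunk that is currently open.
        if tag in ("O", "B") and start is not None:
            spans.append((start, i))
            start = None
        # "I" and "B" open a chunk if none is open.
        if tag in ("I", "B") and start is None:
            start = i
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def precision_recall(gold: List[Span], predicted: List[Span]) -> Tuple[float, float]:
    """Chunk-level precision and recall under exact span matching (an assumption)."""
    correct = len(set(gold) & set(predicted))
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: three noun chunks, the last two adjacent.
words = ["The", "cat", "gave", "the", "dog", "a", "bone"]
gold_spans = [(0, 2), (3, 5), (5, 7)]
gold_tags = spans_to_tags(len(words), gold_spans)

# A tagger that misses the boundary between "the dog" and "a bone".
predicted_tags = ["I", "I", "O", "I", "I", "I", "I"]
predicted_spans = tags_to_spans(predicted_tags)
precision, recall = precision_recall(gold_spans, predicted_spans)
```

In this example a single wrong tag merges two gold chunks into one predicted chunk, so one tagging error costs both precision and recall at the chunk level. That is why per-word tagging accuracy and chunk-level scores can differ noticeably.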

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Algorithms and Data Compression