Memory-Based Learning: Using Similarity for Smoothing

Jakub Zavrel; Walter Daelemans

arXiv:cmp-lg/9705010 · cmp-lg · February 3, 2008 · 33 cites

TL;DR

This paper relates the use of similarity in Memory-Based Learning to backed-off smoothing in statistical language modeling, and shows how feature weighting benefits language processing tasks such as PP-attachment and POS-tagging.

Contribution

It argues that feature weighting in Memory-Based Learning can automatically specify a suitable domain-specific hierarchy between the most specific and the most general conditioning information, without requiring a large number of parameters.

Findings

01 Achieved state-of-the-art results in PP-attachment

02 Achieved state-of-the-art results in POS-tagging

03 Allowed easy integration of diverse information sources, such as rich lexical representations

Abstract

This paper analyses the relation between the use of similarity in Memory-Based Learning and the notion of backed-off smoothing in statistical language modeling. We show that the two approaches are closely related, and we argue that feature weighting methods in the Memory-Based paradigm can offer the advantage of automatically specifying a suitable domain-specific hierarchy between most specific and most general conditioning information without the need for a large number of parameters. We report two applications of this approach: PP-attachment and POS-tagging. Our method achieves state-of-the-art performance in both domains, and allows the easy integration of diverse information sources, such as rich lexical representations.
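
The link between weighted similarity and back-off can be illustrated with a minimal sketch. It assumes an overlap distance in which each mismatching feature adds that feature's weight, and it uses information gain as the weighting; the abstract does not name a specific weighting scheme, so that choice is an assumption. The function names and the toy PP-attachment tuples (verb, noun, preposition, noun) are hypothetical and are not data from the paper.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain_weights(instances, labels):
    """One weight per feature: class entropy minus the expected
    class entropy after splitting on that feature's values."""
    base = entropy(labels)
    weights = []
    for f in range(len(instances[0])):
        by_value = defaultdict(list)
        for x, y in zip(instances, labels):
            by_value[x[f]].append(y)
        remainder = sum(len(ys) / len(labels) * entropy(ys)
                        for ys in by_value.values())
        weights.append(base - remainder)
    return weights

def weighted_overlap_distance(a, b, weights):
    """Sum of the weights of the features on which a and b differ."""
    return sum(w for w, x, y in zip(weights, a, b) if x != y)

def classify(query, instances, labels, weights):
    """Majority class among all stored instances at the smallest distance."""
    distances = [weighted_overlap_distance(query, x, weights) for x in instances]
    best = min(distances)
    votes = Counter(y for d, y in zip(distances, labels) if d == best)
    return votes.most_common(1)[0][0]

# Hypothetical toy memory: (verb, noun1, preposition, noun2) -> attachment site
train = [
    ("eat", "pizza", "with", "fork"),
    ("eat", "pizza", "with", "anchovies"),
    ("see", "man", "with", "telescope"),
    ("buy", "book", "of", "poems"),
    ("read", "review", "of", "film"),
    ("put", "book", "on", "table"),
]
labels = ["V", "N", "V", "N", "N", "V"]

weights = information_gain_weights(train, labels)

# No stored instance shares more than the preposition with this query,
# so the nearest neighbours are exactly the instances containing "of".
print(classify(("sell", "copy", "of", "novel"), train, labels, weights))
```

When no stored instance matches the query on all features, the nearest neighbours are those that mismatch only on the features with the smallest total weight. The classifier therefore falls back to less specific matching patterns in an order set by the weights, which plays a role similar to a hand-specified back-off sequence. Plain information gain tends to favour features with many distinct values, as the toy weights show, so a real application may need a normalised variant.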

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification