Memory-Based Learning: Using Similarity for Smoothing
Jakub Zavrel, Walter Daelemans

TL;DR
This paper relates the use of similarity in Memory-Based Learning to backed-off smoothing in statistical language modeling, and shows how feature weighting benefits natural language processing tasks such as PP-attachment and POS-tagging.
Contribution
It argues that feature weighting in Memory-Based Learning can automatically specify a domain-specific hierarchy between the most specific and the most general conditioning information, without requiring a large number of parameters.
Findings
Reports state-of-the-art performance on PP-attachment
Reports state-of-the-art performance on POS-tagging
Allows easy integration of diverse information sources, such as rich lexical representations
Abstract
This paper analyses the relation between the use of similarity in Memory-Based Learning and the notion of backed-off smoothing in statistical language modeling. We show that the two approaches are closely related, and we argue that feature weighting methods in the Memory-Based paradigm can offer the advantage of automatically specifying a suitable domain-specific hierarchy between most specific and most general conditioning information without the need for a large number of parameters. We report two applications of this approach: PP-attachment and POS-tagging. Our method achieves state-of-the-art performance in both domains, and allows the easy integration of diverse information sources, such as rich lexical representations.
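The idea of similarity acting as an implicit back-off can be illustrated with a minimal sketch. It assumes an overlap distance whose per-feature weights are information gain, which is one common weighting choice in memory-based learners and not necessarily the exact configuration used in the paper. The function names (`entropy`, `information_gain_weights`, `classify`) and the toy PP-attachment instances are hypothetical and invented for illustration only.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain_weights(instances, labels):
    """One weight per feature: how much knowing its value reduces class entropy."""
    base = entropy(labels)
    weights = []
    for f in range(len(instances[0])):
        by_value = defaultdict(list)
        for x, y in zip(instances, labels):
            by_value[x[f]].append(y)
        remainder = sum(len(ys) / len(labels) * entropy(ys) for ys in by_value.values())
        weights.append(base - remainder)
    return weights


def classify(query, instances, labels, weights):
    """Majority class among the stored instances at the smallest weighted overlap distance."""
    def distance(a, b):
        # Each mismatching feature costs its weight; matching features cost nothing.
        return sum(w for w, u, v in zip(weights, a, b) if u != v)

    best = min(distance(query, x) for x in instances)
    votes = Counter(y for x, y in zip(instances, labels) if distance(query, x) == best)
    return votes.most_common(1)[0][0]


# Invented toy data: (verb, noun1, preposition, noun2) -> attachment site (V or N).
train_x = [
    ("eat", "pizza", "with", "fork"),
    ("eat", "pizza", "with", "cheese"),
    ("see", "man", "with", "telescope"),
    ("buy", "book", "of", "poems"),
    ("read", "review", "of", "film"),
    ("open", "door", "with", "key"),
]
train_y = ["V", "N", "V", "N", "N", "V"]

weights = information_gain_weights(train_x, train_y)

# Only the preposition of this query has been seen before, so the nearest
# neighbours are exactly the stored instances that share it.
prediction = classify(("drink", "cup", "of", "tea"), train_x, train_y, weights)
```

When no stored instance matches the query on all features, the nearest neighbours are those that mismatch only on the lowest-weighted features. In this sketch the weights therefore decide which conditioning information is dropped first, playing the role that an explicit back-off order plays in a smoothed statistical model.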
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification
