Learning Probabilistic Models of Word Sense Disambiguation
Ted Pedersen

TL;DR
This dissertation introduces new supervised and unsupervised probabilistic methods for word sense disambiguation. It finds that the Naive Bayesian model performs well in both settings and explains that success in terms of learning rates and bias-variance decompositions.
Contribution
It presents new supervised methods that search a space of probabilistic models, applies Gibbs Sampling and the Expectation Maximization (EM) algorithm to unsupervised learning of word sense disambiguation models, and explains why the Naive Bayesian model succeeds.
Findings
Naive Bayesian models perform well in both supervised and unsupervised settings.
Gibbs Sampling and the Expectation Maximization (EM) algorithm form the basis of the unsupervised methods.
The success of the Naive Bayesian model is explained in terms of learning rates and bias-variance decompositions.
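To make the Naive Bayesian model concrete, here is a minimal supervised sketch in Python. It is an illustration under stated assumptions, not the dissertation's implementation: the bag-of-words context features, the add-alpha smoothing, the toy data for the ambiguous word "bank", and the function names train_naive_bayes and classify are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples, alpha=1.0):
    """Count senses and context words from (sense, context_words) pairs.

    Assumption: contexts are bags of words with add-alpha smoothing;
    the dissertation's actual feature set may differ.
    """
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab, alpha

def classify(model, words):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab, alpha = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[sense].values()) + alpha * len(vocab)
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][w] + alpha) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Hypothetical sense-tagged contexts for the ambiguous word "bank".
toy_data = [
    ("financial", ["deposit", "money", "loan"]),
    ("financial", ["account", "money", "interest"]),
    ("river", ["water", "fishing", "shore"]),
    ("river", ["muddy", "water", "boat"]),
]
model = train_naive_bayes(toy_data)
```

Calling classify(model, ["money", "loan"]) selects the sense whose training contexts make those words most probable. The model treats context words as conditionally independent given the sense, which is the "naive" assumption the name refers to.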
Abstract
This dissertation presents several new methods of supervised and unsupervised learning of word sense disambiguation models. The supervised methods focus on performing model searches through a space of probabilistic models, and the unsupervised methods rely on the use of Gibbs Sampling and the Expectation Maximization (EM) algorithm. In both the supervised and unsupervised case, the Naive Bayesian model is found to perform well. An explanation for this success is presented in terms of learning rates and bias-variance decompositions.
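The unsupervised case can be sketched by treating the sense as a latent variable in a Naive Bayesian model and fitting it with EM. The code below is a minimal illustration under assumptions, not the dissertation's procedure: the bag-of-words contexts, the smoothing constant alpha, the random soft initialization, and the name em_naive_bayes are hypothetical, and Gibbs Sampling is not shown.

```python
import math
import random

def em_naive_bayes(contexts, n_senses=2, n_iters=20, alpha=0.1, seed=0):
    """Fit a latent-sense Naive Bayes mixture to untagged contexts with EM.

    Returns per-context sense responsibilities, sense priors, and
    per-sense word probabilities. All settings are illustrative.
    """
    rng = random.Random(seed)
    vocab = sorted({w for ctx in contexts for w in ctx})

    # Random soft initial assignment of each context to the latent senses.
    resp = []
    for _ in contexts:
        row = [rng.random() + 1e-3 for _ in range(n_senses)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(n_iters):
        # M-step: re-estimate parameters from the soft (expected) counts.
        prior = [
            (sum(r[k] for r in resp) + alpha) / (len(contexts) + alpha * n_senses)
            for k in range(n_senses)
        ]
        word_prob = []
        for k in range(n_senses):
            counts = {w: alpha for w in vocab}
            for ctx, r in zip(contexts, resp):
                for w in ctx:
                    counts[w] += r[k]
            total = sum(counts.values())
            word_prob.append({w: c / total for w, c in counts.items()})

        # E-step: posterior over senses for each context, computed in log space.
        new_resp = []
        for ctx in contexts:
            log_post = [
                math.log(prior[k]) + sum(math.log(word_prob[k][w]) for w in ctx)
                for k in range(n_senses)
            ]
            m = max(log_post)
            unnorm = [math.exp(lp - m) for lp in log_post]
            z = sum(unnorm)
            new_resp.append([u / z for u in unnorm])
        resp = new_resp

    return resp, prior, word_prob

# Hypothetical untagged contexts; the first and last are identical.
toy_contexts = [
    ["money", "loan", "deposit"],
    ["water", "boat", "shore"],
    ["money", "interest", "account"],
    ["water", "fishing", "muddy"],
    ["money", "loan", "deposit"],
]
resp, prior, word_prob = em_naive_bayes(toy_contexts)
```

The latent senses found this way are unlabeled clusters, so mapping them to dictionary senses requires a separate step. Results also depend on the random initialization, which is one reason to try several seeds.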
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Bayesian Modeling and Causal Inference
