Distributional Clustering of English Words

Fernando Pereira (AT&T Bell Laboratories); Naftali Tishby (Hebrew University); Lillian Lee (Harvard University)

arXiv:cmp-lg/9408011 · cmp-lg · February 3, 2008 · 3 cites

Open Access

TL;DR

This paper clusters English words according to their distributions in particular syntactic contexts. Deterministic annealing yields a hierarchy of soft clusters, which serve as the basis for class models of word cooccurrence that are evaluated on held-out test data.

Contribution

The paper applies deterministic annealing to distributional word clustering, producing a hierarchical soft clustering that serves as the basis for class-based models of word cooccurrence.

Findings

01

Hierarchical soft clusters serve as the basis for class models of word cooccurrence.

02

As the annealing parameter increases, existing clusters become unstable and subdivide, which produces the hierarchy.

03

The resulting class models are evaluated on held-out test data.

Abstract

We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest-distortion sets of clusters. As the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word cooccurrence, and the models are evaluated with respect to held-out test data.
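The annealing procedure in the abstract can be illustrated with a minimal sketch. The abstract does not name the distortion measure or the update rules, so everything here is an assumption made for illustration: the distortion is taken to be the KL divergence between a word's context distribution and a cluster centroid, memberships are proportional to exp(-beta * distortion), centroids are membership-weighted averages, and the number of clusters is fixed at two. The function names (`kl`, `soft_assign`, `update_centroids`, `perturb`, `anneal`) and the toy data are hypothetical and do not come from the paper.

```python
import math
import random

def kl(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def soft_assign(dists, centroids, beta):
    """Membership of each word in each cluster, proportional to exp(-beta * D)."""
    memberships = []
    for p in dists:
        d = [kl(p, c) for c in centroids]
        d_min = min(d)  # shift by the minimum for numerical stability
        w = [math.exp(-beta * (di - d_min)) for di in d]
        z = sum(w)
        memberships.append([wi / z for wi in w])
    return memberships

def update_centroids(dists, memberships):
    """Each centroid is the membership-weighted average of the word distributions."""
    k, dim = len(memberships[0]), len(dists[0])
    centroids = []
    for c in range(k):
        total = sum(m[c] for m in memberships)
        centroids.append([
            sum(m[c] * p[j] for p, m in zip(dists, memberships)) / total
            for j in range(dim)
        ])
    return centroids

def perturb(centroid, rng, eps=0.01):
    """Nudge a centroid slightly and renormalize, so coincident centroids can split."""
    noisy = [x * (1.0 + eps * rng.uniform(-1.0, 1.0)) for x in centroid]
    z = sum(noisy)
    return [x / z for x in noisy]

def anneal(dists, betas, k=2, iters=100, seed=0):
    """Run the soft clustering at each beta in increasing order, reusing centroids."""
    rng = random.Random(seed)
    dim = len(dists[0])
    mean = [sum(p[j] for p in dists) / len(dists) for j in range(dim)]
    centroids = [mean[:] for _ in range(k)]  # all centroids start at the mean
    stages = []
    for beta in betas:
        centroids = [perturb(c, rng) for c in centroids]
        for _ in range(iters):
            memberships = soft_assign(dists, centroids, beta)
            centroids = update_centroids(dists, memberships)
        stages.append((beta, soft_assign(dists, centroids, beta)))
    return stages

# Toy data (hypothetical): four "words", each a distribution over three contexts.
# All entries are strictly positive so that every KL divergence is finite.
words = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]

stages = anneal(words, betas=[0.5, 3.0, 20.0])
for beta, memberships in stages:
    print(beta, [[round(x, 3) for x in m] for m in memberships])
```

On this toy input, the two centroids should coincide at the smallest beta, leaving every word split evenly between them, and separate at the largest beta, where memberships become nearly hard. That is the kind of instability and subdivision the abstract describes, here limited to a single split; a hierarchical version would repeat the perturb-and-split step on each resulting cluster.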

Taxonomy

Topics: Natural Language Processing Techniques · Data Mining Algorithms and Applications · Bayesian Methods and Mixture Models