Learning Word Association Norms Using Tree Cut Pair Models

Naoki Abe; Hang Li (Theory NEC Lab.; RWCP)

arXiv:cmp-lg/9605029·cmp-lg·February 3, 2008·43 cites

TL;DR

This paper introduces a novel two-step MDL-based method for learning word association norms within hierarchical classifications, improving case-frame pattern acquisition and disambiguation performance.

Contribution

It proposes a new MDL-based framework for estimating association norms using tree cut models, enhancing the learning of word co-occurrence patterns in hierarchical domains.

Findings

01. The method outperforms existing approaches in acquiring case-frame patterns.

02. The learned association norms improve disambiguation accuracy.

03. The algorithm for tree cut model estimation is efficient.

Abstract

We consider the problem of learning co-occurrence information between two word categories, or more generally between two discrete random variables taking values in a hierarchically classified domain. In particular, we consider the problem of learning the 'association norm' defined by A(x,y) = p(x,y)/(p(x)*p(y)), where p(x,y) is the joint distribution of x and y, and p(x) and p(y) are the marginal distributions induced by p(x,y). We formulate this problem as a sub-task of learning the conditional distribution p(x|y), by exploiting the identity p(x|y) = A(x,y)*p(x). We propose a two-step estimation method based on the MDL principle, which works as follows: it first estimates p(x) as p1 using MDL, and then estimates p(x|y) for a fixed y by applying MDL to the hypothesis class {A * p1 | A \in B} for some given class B of representations of the association norm. The estimation of A is…
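
The identity p(x|y) = A(x,y)*p(x) follows from p(x|y) = p(x,y)/p(y). A minimal sketch of it in Python is given here. It uses plain relative frequencies rather than the MDL-based tree cut estimation the abstract describes. The toy (noun, verb) pairs and the function names are assumptions made up for illustration.

```python
from collections import Counter

# Toy (x, y) co-occurrence data, e.g. (noun, verb); invented for illustration only.
PAIRS = [("bird", "fly"), ("bird", "fly"), ("bird", "eat"),
         ("plane", "fly"), ("dog", "eat"), ("dog", "eat")]

def empirical_distributions(pairs):
    """Relative-frequency estimates of p(x, y), p(x) and p(y) (not MDL estimates)."""
    n = len(pairs)
    joint = {xy: c / n for xy, c in Counter(pairs).items()}
    p_x = {x: c / n for x, c in Counter(x for x, _ in pairs).items()}
    p_y = {y: c / n for y, c in Counter(y for _, y in pairs).items()}
    return joint, p_x, p_y

def association_norm(joint, p_x, p_y, x, y):
    """A(x, y) = p(x, y) / (p(x) * p(y)); unseen pairs get a joint probability of 0."""
    return joint.get((x, y), 0.0) / (p_x[x] * p_y[y])

def conditional(joint, p_x, p_y, x, y):
    """p(x | y) recovered through the identity p(x | y) = A(x, y) * p(x)."""
    return association_norm(joint, p_x, p_y, x, y) * p_x[x]

joint, p_x, p_y = empirical_distributions(PAIRS)
a_bird_fly = association_norm(joint, p_x, p_y, "bird", "fly")
p_bird_given_fly = conditional(joint, p_x, p_y, "bird", "fly")
```

A value of A(x,y) above 1 means x and y co-occur more often than independence would predict, and a value below 1 means less often. With raw frequencies, unseen pairs get A = 0, which is the kind of sparse-data problem that estimating over a hierarchical classification is meant to address.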

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies