Discovery of Linguistic Relations Using Lexical Attraction
Deniz Yuret (MIT Artificial Intelligence Laboratory)

TL;DR
This paper introduces lexical attraction models, a new probabilistic framework for representing long-distance word relations, and demonstrates an unsupervised language acquisition system that learns linguistic relations directly from raw text.
Contribution
It proposes a novel class of lexical attraction models formalized with information theory and develops an unsupervised learning program that identifies linguistic relations without prior grammar or lexicon.
Findings
Achieved 60% precision and 50% recall in finding relations between content words after training on 100 million words of raw text.
Successfully identified relations in syntactically ambiguous sentences.
Demonstrated the effectiveness of lexical attraction in language learning.
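To make the precision and recall figures concrete, here is a minimal sketch of how such scores could be computed for one sentence. It assumes relations are scored as unordered links between word positions; the evaluation protocol actually used in the paper is not described in this summary, and the `precision_recall` helper and the example links are hypothetical.

```python
def precision_recall(predicted, gold):
    """Score predicted word-to-word links against gold-standard links.

    Each link is a pair of word positions. Links are treated as unordered,
    which is an assumption of this sketch, not a detail from the paper.
    """
    predicted = {frozenset(link) for link in predicted}
    gold = {frozenset(link) for link in gold}
    correct = len(predicted & gold)
    # Precision: share of predicted links that are correct.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of gold links that were found.
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Made-up example: 6 gold links, 5 predicted links, 3 of them correct.
gold_links = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
predicted_links = [(1, 0), (1, 2), (2, 3), (0, 4), (2, 6)]
precision, recall = precision_recall(predicted_links, gold_links)
```

With these illustrative links, 3 of the 5 predictions are correct (precision 0.6) and 3 of the 6 gold links are found (recall 0.5).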
Abstract
This work has been motivated by two long-term goals: to understand how humans learn language and to build programs that can understand language. Using a representation that makes the relevant features explicit is a prerequisite for successful learning and understanding. Therefore, I chose to represent relations between individual words explicitly in my model. Lexical attraction is defined as the likelihood of such relations. I introduce a new class of probabilistic language models named lexical attraction models, which can represent long-distance relations between words, and I formalize this new class of models using information theory. Within the framework of lexical attraction, I developed an unsupervised language acquisition program that learns to identify linguistic relations in a given sentence. The only explicitly represented linguistic knowledge in the program is lexical…
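As a rough illustration of an information-theoretic score between words, the sketch below estimates pointwise mutual information (PMI) for word pairs from raw text. This is an assumption for illustration: the summary does not give the paper's formula, and the `pmi_table` function, its `window` parameter, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def pmi_table(sentences, window=5):
    """Estimate PMI (in bits) for ordered word pairs from raw sentences.

    A pair (left, right) is counted when `right` follows `left` within
    `window` tokens. Stand-in measure, not the paper's exact definition.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        # PMI compares the observed pair probability with what independence predicts.
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


# Toy corpus (made up); window=1 counts adjacent words only.
corpus = [
    "new york is big",
    "new york is far",
    "the dog is big",
    "the cat is far",
]
scores = pmi_table(corpus, window=1)
```

On this toy corpus, the pair ("new", "york") scores higher than ("york", "is"), because "is" also follows other words while "new" and "york" only ever appear together.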
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Topic Modeling · Natural Language Processing Techniques
