Linguistic Structure as Composition and Perturbation
Carl de Marcken (MIT Artificial Intelligence Lab.)

TL;DR
This paper argues for a representation of language in which linguistic parameters such as words are built by perturbing a composition of existing parameters. It illustrates the representation with examples in text segmentation and compression, lexicon acquisition from raw speech, and mappings between text and artificial representations of meaning.
Contribution
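It introduces an approach to learning language from unprocessed text and speech in which parameters are built by perturbing compositions of existing parameters, and applies it to lexicon acquisition and to mappings between text and artificial representations of meaning.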
Findings
Demonstrated on examples of text segmentation and compression
Demonstrated on acquisition of a lexicon from raw speech
Demonstrated on acquisition of mappings between text and artificial representations of meaning
Abstract
This paper discusses the problem of learning language from unprocessed text and speech signals, concentrating on the problem of learning a lexicon. In particular, it argues for a representation of language in which linguistic parameters like words are built by perturbing a composition of existing parameters. The power of this representation is demonstrated by several examples in text segmentation and compression, acquisition of a lexicon from raw speech, and the acquisition of mappings between text and artificial representations of meaning.
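The idea of building a word by perturbing a composition of existing parameters can be pictured with a small sketch. This is an illustration under stated assumptions, not the paper's algorithm: the `Entry` class, the `surface` function, the edit format (start, end, replacement), and the "want" + "to" example lexicon are all hypothetical names and data chosen for this note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    # Hypothetical lexicon entry: names of existing entries (or single
    # characters) that are composed, plus a perturbation of the result.
    parts: tuple
    edits: tuple = ()  # each edit is (start, end, replacement)


def surface(name, lexicon):
    """Expand an entry: compose its parts, then apply its perturbation."""
    if name not in lexicon:
        # Anything not in the lexicon is treated as a terminal character.
        return name
    entry = lexicon[name]
    composed = "".join(surface(part, lexicon) for part in entry.parts)
    # Apply edits from right to left so earlier offsets stay valid.
    for start, end, replacement in sorted(entry.edits, reverse=True):
        composed = composed[:start] + replacement + composed[end:]
    return composed


# Illustrative lexicon (an assumption, not data from the paper).
lexicon = {
    "want": Entry(parts=("w", "a", "n", "t")),
    "to": Entry(parts=("t", "o")),
    # "wanna" is the composition "want" + "to" with one perturbation:
    # the span "tto" (offsets 3..6) is replaced by "na".
    "wanna": Entry(parts=("want", "to"), edits=((3, 6, "na"),)),
}

print(surface("wanna", lexicon))
```

In this sketch an entry stores only its components and the way it deviates from them, so an entry that differs slightly from the composition of its parts needs only a short description.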
Taxonomy
Topics: Natural Language Processing Techniques · Algorithms and Data Compression · Speech Recognition and Synthesis
