Unsupervised Learning of Morphology without Morphemes
Sylvain Neuvel, Sean A. Fulop

TL;DR
This paper introduces an unsupervised morphological learner based on Whole Word Morphology. It induces morphological relationships from a POS-tagged lexicon without segmenting words into morphemes, and generates new words with a precision of up to 80% (92% after a post-hoc adjustment).
Contribution
It presents the first morphological learner based on Whole Word Morphology, which induces relationships between whole words without discovering morphemes, and reports preliminary precision of up to 92% on the new words it generates.
Findings
Up to 80% precision on generated words using the pure Whole Word theory
Up to 92% precision after a post-hoc adjustment is added to the routine
Generates new words beyond the learning sample
Abstract
The first morphological learner based upon the theory of Whole Word Morphology (Ford et al., 1997) is outlined, and preliminary evaluation results are presented. The program, Whole Word Morphologizer, takes a POS-tagged lexicon as input, induces morphological relationships without attempting to discover or identify morphemes, and is then able to generate new words beyond the learning sample. The accuracy (precision) of the generated new words is as high as 80% using the pure Whole Word theory, and 92% after a post-hoc adjustment is added to the routine.
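To make the pipeline in the abstract concrete, here is a minimal Python sketch of the general idea: compare whole words in a POS-tagged lexicon, record the alternations that recur across several word pairs, and apply them to produce words not in the lexicon. It is an illustration under strong assumptions, not the authors' Whole Word Morphologizer. The function names (`difference`, `induce_strategies`, `generate`), the thresholds `min_stem` and `min_support`, the restriction to word-final differences, and the toy lexicon with its tags are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def difference(w1, w2):
    """Strip the longest common prefix; return both remainders and its length."""
    i = 0
    while i < min(len(w1), len(w2)) and w1[i] == w2[i]:
        i += 1
    return w1[i:], w2[i:], i


def induce_strategies(lexicon, min_stem=3, min_support=2):
    """Keep whole-word alternations X+a/POS1 <-> X+b/POS2 seen in enough pairs.

    No morphemes are stored: a strategy is only a relation between two
    word shapes and their POS tags. (Assumed simplification: only
    word-final differences are considered.)
    """
    support = defaultdict(set)
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon), 2):
        end1, end2, shared = difference(w1, w2)
        if shared < min_stem:
            continue  # too little shared material to relate the two words
        support[(end1, p1, end2, p2)].add((w1, w2))
        support[(end2, p2, end1, p1)].add((w2, w1))
    return {s for s, pairs in support.items() if len(pairs) >= min_support}


def generate(lexicon, strategies, min_stem=3):
    """Apply each strategy to matching words; return words not in the lexicon."""
    known = {w for w, _ in lexicon}
    new_words = set()
    for word, pos in lexicon:
        for end1, p1, end2, p2 in strategies:
            if pos != p1 or not word.endswith(end1):
                continue
            shared = word[: len(word) - len(end1)]
            if len(shared) < min_stem:
                continue
            candidate = shared + end2
            if candidate not in known:
                new_words.add((candidate, p2))
    return new_words


# Toy POS-tagged lexicon (hypothetical tags: V = base, VZ = 3sg, VD = past).
lexicon = [
    ("walk", "V"), ("walks", "VZ"), ("walked", "VD"),
    ("talk", "V"), ("talks", "VZ"), ("talked", "VD"),
    ("jump", "V"), ("jumps", "VZ"),
]

print(generate(lexicon, induce_strategies(lexicon)))  # {('jumped', 'VD')}
```

In this toy run, the alternation between a base verb and its past form is supported by the walk/walked and talk/talked pairs, so it is applied to "jump" to propose "jumped". Precision, as reported in the abstract, would then be the fraction of such proposed words that are actual words of the language.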
Taxonomy
Topics: Natural Language Processing Techniques · Algorithms and Data Compression · Topic Modeling
