Unsupervised Protoform Reconstruction through Parsimonious Rule-guided Heuristics and Evolutionary Search
Promise Dodzi Kpoglu

TL;DR
This paper introduces an unsupervised hybrid approach combining heuristics and evolutionary search to reconstruct ancestral protoforms, improving accuracy and phonological plausibility over existing probabilistic methods.
Contribution
It presents a novel hybrid method that integrates rule-based heuristics with evolutionary optimization for protoform reconstruction, advancing beyond purely probabilistic models.
Findings
Substantial character-level accuracy improvements over established baselines
Enhanced phonological plausibility of reconstructed protoforms
Effective at reconstructing Latin protoforms from cognate sets in five Romance languages
Abstract
We propose an unsupervised method for the reconstruction of protoforms, i.e., ancestral word forms from which modern language forms are derived. While prior work has primarily relied on probabilistic models of phonological edits to infer protoforms from cognate sets, such approaches are limited by their predominantly data-driven nature. In contrast, our model integrates data-driven inference with rule-based heuristics within an evolutionary optimization framework. This hybrid approach leverages both statistical patterns and linguistically motivated constraints to guide the reconstruction process. We evaluate our method on the task of reconstructing Latin protoforms using a dataset of cognates from five Romance languages. Experimental results demonstrate substantial improvements over established baselines on both character-level accuracy and phonological plausibility metrics.
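As a rough illustration of how an evolutionary search can combine a data-driven term with a rule-based heuristic, the sketch below evolves candidate protoforms for a single cognate set. It is not the paper's algorithm. The cost function (summed edit distance to the cognates), the consonant-cluster rule, the weight, and every function name are assumptions made for this example. The cognates are ordinary orthographic forms of the Romance words for "night", not items from the paper's dataset.

```python
import random

VOWELS = set("aeiou")

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]

def heuristic_penalty(form: str) -> int:
    """Hypothetical rule: count each run that reaches three consecutive consonants."""
    penalty, run = 0, 0
    for ch in form:
        run = 0 if ch in VOWELS else run + 1
        if run == 3:
            penalty += 1
    return penalty

def cost(candidate: str, cognates: list[str], weight: float = 2.0) -> float:
    """Data-driven term (distance to the cognates) plus weighted rule-based term."""
    distance = sum(edit_distance(candidate, c) for c in cognates)
    return distance + weight * heuristic_penalty(candidate)

def mutate(form: str, alphabet: list[str], rng: random.Random) -> str:
    """Apply one random single-character edit; never returns an empty string."""
    op = rng.choice(("sub", "ins", "del"))
    if op == "del" and len(form) > 1:
        i = rng.randrange(len(form))
        return form[:i] + form[i + 1:]
    if op == "ins":
        i = rng.randrange(len(form) + 1)
        return form[:i] + rng.choice(alphabet) + form[i:]
    i = rng.randrange(len(form))  # substitution (also the fallback for a 1-char form)
    return form[:i] + rng.choice(alphabet) + form[i + 1:]

def reconstruct(cognates: list[str], generations: int = 200,
                pop_size: int = 30, seed: int = 0) -> str:
    """Elitist evolutionary loop: mutate, pool with parents, keep the cheapest forms."""
    rng = random.Random(seed)
    alphabet = sorted(set("".join(cognates)))
    population = list(cognates)  # seed the search with the attested forms
    for _ in range(generations):
        children = [mutate(rng.choice(population), alphabet, rng)
                    for _ in range(pop_size)]
        pool = set(population + children)
        population = sorted(pool, key=lambda f: (cost(f, cognates), f))[:pop_size]
    return population[0]

cognates = ["noche", "notte", "nuit", "noite", "noapte"]
best = reconstruct(cognates)
print(best, cost(best, cognates))
```

Because the loop keeps the cheapest forms from each generation, the returned candidate never costs more than the best attested cognate. A realistic system would operate on phonetic transcriptions and use linguistically motivated rules in place of the toy consonant-cluster penalty.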
Taxonomy
Topics: Natural Language Processing Techniques · Language and cultural evolution · Authorship Attribution and Profiling
