ConlangCrafter: Constructing Languages with a Multi-Hop LLM Pipeline
Morris Alper, Moran Yanuka, Raja Giryes, Gašper Beguš

TL;DR
This paper presents ConlangCrafter, a multi-stage LLM pipeline that automates the creation of constructed languages, encouraging diversity through injected randomness and consistency through self-refinement.
Contribution
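It introduces a framework for automated conlang generation with LLMs that combines modular stages (phonology, morphology, syntax, lexicon generation, and translation) with self-refinement feedback, together with a scalable evaluation framework that measures consistency and typological diversity.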
Findings
ConlangCrafter produces coherent, diverse conlangs without human linguistic input.
The evaluation framework effectively measures consistency and typological diversity.
Automatic and manual assessments confirm the quality of generated languages.
Abstract
Constructed languages (conlangs) such as Esperanto and Quenya have played diverse roles in art, philosophy, and international communication. Meanwhile, foundation models have revolutionized creative generation in text, images, and beyond. In this work, we leverage modern LLMs as computational creativity aids for end-to-end conlang creation. We introduce ConlangCrafter, a multi-hop pipeline that decomposes language design into modular stages: phonology, morphology, syntax, lexicon generation, and translation. At each stage, our method leverages LLMs' metalinguistic reasoning capabilities, injecting randomness to encourage diversity and leveraging self-refinement feedback to encourage consistency in the emerging language description. We construct a novel, scalable evaluation framework for this task, with metrics that measure consistency and typological diversity. Automatic and manual…
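The staged design described in the abstract can be illustrated with a minimal Python sketch. Only the stage order comes from the abstract. The function names (run_stage, build_conlang, stub_llm), the prompt wording, the hint pool, and the convention that a critique of "OK" means no issues are all hypothetical, and the stub stands in for a real model call so the example runs offline. This is not the authors' implementation.

```python
import random
from typing import Callable, Dict, List

# Stage order taken from the abstract; everything else here is hypothetical.
STAGES: List[str] = ["phonology", "morphology", "syntax", "lexicon", "translation"]

# Illustrative pool of hints used to inject randomness (an assumption).
HINTS: List[str] = [
    "prefer rare features",
    "prefer common features",
    "mix rare and common features",
]

LLM = Callable[[str], str]  # any function mapping a prompt to text


def run_stage(llm: LLM, stage: str, spec: Dict[str, str],
              rng: random.Random, max_rounds: int = 2) -> str:
    """Draft one stage, then critique and revise it up to max_rounds times."""
    hint = rng.choice(HINTS)  # injected randomness to vary the output
    context = "\n".join(f"[{name}]\n{text}" for name, text in spec.items())
    draft = llm(f"DRAFT {stage}\nHint: {hint}\nLanguage so far:\n{context}")
    for _ in range(max_rounds):
        critique = llm(f"CRITIQUE {stage}\nLanguage so far:\n{context}\nDraft:\n{draft}")
        if critique.strip() == "OK":  # hypothetical convention: no issues found
            break
        draft = llm(f"REVISE {stage}\nIssues:\n{critique}\nDraft:\n{draft}")
    return draft


def build_conlang(llm: LLM, seed: int = 0) -> Dict[str, str]:
    """Run the stages in order; each stage sees the output of earlier ones."""
    rng = random.Random(seed)
    spec: Dict[str, str] = {}
    for stage in STAGES:
        spec[stage] = run_stage(llm, stage, spec, rng)
    return spec


def stub_llm(prompt: str) -> str:
    """Offline stand-in for a real model call, used only so the sketch runs."""
    kind, stage = prompt.split("\n", 1)[0].split(" ", 1)
    if kind == "CRITIQUE":
        draft = prompt.rsplit("Draft:\n", 1)[-1]
        return "OK" if draft.endswith("(revised)") else "inconsistent with earlier stages"
    if kind == "REVISE":
        return f"{stage} description (revised)"
    return f"{stage} description"


if __name__ == "__main__":
    for name, text in build_conlang(stub_llm, seed=0).items():
        print(f"{name}: {text}")
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client; a real run would replace stub_llm with a function that sends the prompt to a model and returns its reply.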
