Basic concepts and tools for the Toki Pona minimal and constructed language: description of the language and main issues; analysis of the vocabulary; text synthesis and syntax highlighting; Wordnet synsets
Renato Fabbri

TL;DR
This paper presents tools and analyses for Toki Pona, a minimal constructed language, including vocabulary analysis, text synthesis, syntax highlighting, and a preliminary Wordnet, facilitating linguistic experiments and tool development.
Contribution
It introduces Python and Vim routines for Toki Pona analysis, synthesis, syntax highlighting, and a preliminary Wordnet, advancing resources for minimal conlang research.
Findings
Analysis of the basic vocabulary (corpus analyses already exist elsewhere)
Text synthesis using sentence templates and context tracking
Syntax highlighting schemes implemented in Vim
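As a rough illustration of what a basic vocabulary analysis might look like, the sketch below counts letter frequencies over a handful of Toki Pona words and checks that each word stays within the 14-letter alphabet. The word sample and the function names (`letter_frequencies`, `uses_only_alphabet`, `vowel_ratio`) are assumptions made for this example; they are not taken from the paper's routines, and the sample is not the full 124-lemma vocabulary.

```python
from collections import Counter

# The 14 letters of Toki Pona: 5 vowels and 9 consonants.
VOWELS = set("aeiou")
CONSONANTS = set("jklmnpstw")
ALPHABET = VOWELS | CONSONANTS

# A small hand-picked sample (assumption), NOT the full 124-lemma vocabulary.
SAMPLE_WORDS = ["toki", "pona", "mi", "sina", "jan",
                "moku", "suli", "lili", "telo", "ma"]


def letter_frequencies(words):
    """Count how often each letter occurs across the given words."""
    return Counter("".join(words))


def uses_only_alphabet(word):
    """Return True if every letter of the word is in the TP alphabet."""
    return set(word) <= ALPHABET


def vowel_ratio(words):
    """Fraction of all letters in the given words that are vowels."""
    letters = "".join(words)
    return sum(ch in VOWELS for ch in letters) / len(letters)


freqs = letter_frequencies(SAMPLE_WORDS)

if __name__ == "__main__":
    # Print letters from most to least frequent in the sample.
    for letter, count in freqs.most_common():
        print(letter, count)
    print("vowel ratio:", round(vowel_ratio(SAMPLE_WORDS), 3))
```

The same functions could be applied to the complete lemma list once it is loaded from a dictionary file; the sample here only keeps the example self-contained.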
Abstract
A minimal constructed language (conlang) is useful for experiments and convenient for building tools. The Toki Pona (TP) conlang is minimal both in its vocabulary (only 14 letters and 124 lemmas) and in its roughly 10 syntax rules. It is also an actively used and fairly well-established minimal conlang, with at least hundreds of fluent speakers. This article presents current concepts and resources for TP, and makes available Python (and Vim) scripted routines for analysis of the language, synthesis of texts, syntax highlighting schemes, and the construction of a preliminary TP Wordnet. The focus is on the analysis of the basic vocabulary, since corpus analyses already exist. The synthesis is based on sentence templates, relates to context by keeping track of used words, and renders larger texts by using a fixed number of phonemes (e.g. for poems) and number of sentences, words and…
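The template-based synthesis with context tracking described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the word lists, the three templates, and the names `pick_word`, `make_sentence`, and `make_text` are all hypothetical, and the fixed-phoneme-count constraint mentioned for poems is left out.

```python
import random

# Tiny illustrative word lists (assumption: a sample, not the official
# dictionary's morphosyntactic classes).
NOUNS = ["jan", "telo", "moku", "ma", "tomo"]
VERBS = ["moku", "lukin", "pali", "olin"]
MODIFIERS = ["pona", "suli", "lili", "mute"]

# Hypothetical templates. "li" introduces the predicate and "e" the
# direct object; the other tokens are slots filled from the lists above.
TEMPLATES = [
    ["NOUN", "li", "MOD"],
    ["NOUN", "li", "VERB", "e", "NOUN"],
    ["NOUN", "MOD", "li", "VERB", "e", "NOUN", "MOD"],
]
SLOTS = {"NOUN": NOUNS, "VERB": VERBS, "MOD": MODIFIERS}


def pick_word(candidates, used, rng):
    """Prefer words not used yet; fall back to any candidate if none is left."""
    fresh = [w for w in candidates if w not in used]
    word = rng.choice(fresh or candidates)
    used.add(word)  # context tracking: remember every word placed so far
    return word


def make_sentence(used, rng):
    """Fill one randomly chosen template, updating the set of used words."""
    template = rng.choice(TEMPLATES)
    return " ".join(
        pick_word(SLOTS[tok], used, rng) if tok in SLOTS else tok
        for tok in template
    )


def make_text(n_sentences, seed=0):
    """Render a fixed number of sentences sharing one used-word context."""
    rng = random.Random(seed)
    used = set()
    return [make_sentence(used, rng) for _ in range(n_sentences)]


if __name__ == "__main__":
    for sentence in make_text(4, seed=1):
        print(sentence)
```

Sharing a single `used` set across sentences is one simple way to make later sentences depend on earlier ones; a fuller version might also stop once a target phoneme or letter count is reached.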
Taxonomy
Topics: Natural Language Processing Techniques · Video Analysis and Summarization · Music and Audio Processing
