# Learning Unions of k-Testable Languages

**Authors:** Alexis Linard, Colin de la Higuera, Frits Vaandrager

arXiv: 1812.08269 · 2018-12-21

## TL;DR

This paper presents an efficient algorithm for learning a union of k-testable languages in the strict sense (k-TSS) from examples that belong to several different unknown languages. A k-TSS language is defined by sets of allowed prefixes, infixes (substrings), and suffixes.

## Contribution

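The paper establishes a Galois connection between the lattice of all languages over an alphabet Σ and the lattice of k-TSS languages over Σ, and defines a simple metric on k-TSS languages. Together, these yield an efficient algorithm for learning a union of k-TSS languages from examples.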
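For readers unfamiliar with the term, the general order-theoretic definition of a Galois connection can be written as follows. This is a sketch of the textbook definition only: the map names $\alpha$ and $\gamma$, the ordering symbol $\sqsubseteq$, and the set name $\mathcal{Z}_k$ are assumptions introduced here for illustration, and the paper's own notation may differ.

```latex
% Assumed notation: \mathcal{Z}_k is the set of k-TSS descriptions,
% ordered by \sqsubseteq; languages are ordered by set inclusion.
\alpha : \mathcal{P}(\Sigma^*) \to \mathcal{Z}_k,
\qquad
\gamma : \mathcal{Z}_k \to \mathcal{P}(\Sigma^*)

% Galois connection: for every language L and every description Z,
\alpha(L) \sqsubseteq Z
\iff
L \subseteq \gamma(Z)

% Consequence: \gamma(\alpha(L)) contains L and is the least language
% in the image of \gamma that does so.
L \subseteq \gamma(\alpha(L)),
\qquad
L \subseteq \gamma(Z) \implies \gamma(\alpha(L)) \subseteq \gamma(Z)
```

Under this reading, $\alpha$ maps a set of examples to its tightest k-TSS description, and $\gamma$ maps a description back to the language it denotes.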

## Key findings

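- The Galois connection and the metric together yield an efficient algorithm for learning unions of k-TSS languages
- The algorithm is evaluated on an industrial dataset, which the authors use to demonstrate the relevance of the approach
- The work gives a theoretical basis for decomposing a language into smaller k-TSS languages that are easier to represent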

## Abstract

A classical problem in grammatical inference is to identify a language from a set of examples. In this paper, we address the problem of identifying a union of languages from examples that belong to several different unknown languages. Indeed, decomposing a language into smaller pieces that are easier to represent should make learning easier than aiming for a too generalized language. In particular, we consider k-testable languages in the strict sense (k-TSS). These are defined by a set of allowed prefixes, infixes (sub-strings) and suffixes that words in the language may contain. We establish a Galois connection between the lattice of all languages over alphabet Σ, and the lattice of k-TSS languages over Σ. We also define a simple metric on k-TSS languages. The Galois connection and the metric allow us to derive an efficient algorithm to learn the union of k-TSS languages. We evaluate our algorithm on an industrial dataset and thus demonstrate the relevance of our approach.
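To make the prefix, infix, and suffix description concrete, here is a minimal Python sketch of a k-TSS description inferred from a sample, together with a membership test. It is an illustration under stated assumptions, not the paper's algorithm: the names `KTestVector`, `infer`, `accepts`, and `accepts_union` are hypothetical, the treatment of words shorter than k is a simplification, and the clustering step that decides which examples belong to which language (where the paper uses its metric) is not shown.

```python
from typing import FrozenSet, Iterable, NamedTuple


class KTestVector(NamedTuple):
    """Hypothetical container for a k-TSS description (simplified)."""
    k: int
    prefixes: FrozenSet[str]  # allowed prefixes of length k-1
    suffixes: FrozenSet[str]  # allowed suffixes of length k-1
    segments: FrozenSet[str]  # allowed infixes of length k
    short: FrozenSet[str]     # allowed words shorter than k


def infer(sample: Iterable[str], k: int) -> KTestVector:
    """Collect the prefixes, suffixes, and segments seen in the sample."""
    if k < 1:
        raise ValueError("k must be at least 1")
    prefixes, suffixes, segments, short = set(), set(), set(), set()
    for w in sample:
        if len(w) < k:
            # Simplifying assumption: short words are stored verbatim.
            short.add(w)
            continue
        prefixes.add(w[:k - 1])
        suffixes.add(w[len(w) - (k - 1):])
        for i in range(len(w) - k + 1):
            segments.add(w[i:i + k])
    return KTestVector(k, frozenset(prefixes), frozenset(suffixes),
                       frozenset(segments), frozenset(short))


def accepts(z: KTestVector, w: str) -> bool:
    """Check a word against the allowed prefixes, suffixes, and segments."""
    k = z.k
    if len(w) < k:
        return w in z.short
    return (
        w[:k - 1] in z.prefixes
        and w[len(w) - (k - 1):] in z.suffixes
        and all(w[i:i + k] in z.segments for i in range(len(w) - k + 1))
    )


def accepts_union(vectors: Iterable[KTestVector], w: str) -> bool:
    """A word is in the union if at least one component accepts it."""
    return any(accepts(z, w) for z in vectors)


# Two small samples, assumed to come from two different languages.
z1 = infer(["ab", "abb"], 2)
z2 = infer(["ba", "baa"], 2)
# One description learned from all examples pooled together.
pooled = infer(["ab", "abb", "ba", "baa"], 2)
```

In this example the pooled description accepts `aba`, a word that neither sample suggests, while the union of `z1` and `z2` rejects it. This illustrates the abstract's point that learning a single language from mixed examples can generalize too much.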

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.08269/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1812.08269/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/1812.08269/full.md

---
Source: https://tomesphere.com/paper/1812.08269