Classifying Words with 3-sort Automata
Tomasz Jastrząb, Frédéric Lardeux, Eric Monfroy

TL;DR
This paper presents models for inferring 3-sort non-deterministic finite automata (NFAs) from a sample of words to accept and words to reject, transforms the inferred automata into weighted-frequency and probabilistic NFAs, and evaluates the probabilistic NFAs on classification tasks over several data sets.
Contribution
It presents a novel approach for inferring 3-sort NFAs and converting them into probabilistic models for classification, with experimental validation.
Findings
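In the reported experiments, probabilistic NFAs are applied successfully to classification tasks.
The evaluation covers both real-life and benchmark data sets.
The inferred 3-sort NFA is transformed into weighted-frequency and probabilistic NFAs; the probabilistic NFA is the one used as the classifier.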
Abstract
Grammatical inference consists in learning a language or a grammar from data. In this paper, we consider a number of models for inferring a non-deterministic finite automaton (NFA) with 3 sorts of states, that must accept some words, and reject some other words from a given sample. We then propose a transformation from this 3-sort NFA into weighted-frequency and probabilistic NFAs, and we apply the latter to a classification task. The experimental evaluation of our approach shows that the probabilistic NFAs can be successfully applied for classification tasks on both real-life and superficial benchmark data sets.
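To make the idea of classifying words with a probabilistic NFA more concrete, here is a minimal sketch of one generic way such a classifier could work. It is not the authors' construction: the abstract does not say how weights are derived or how a class is chosen, so the count normalization, the highest-score decision rule, and every name below (`normalize`, `word_score`, `classify`, the toy counts) are assumptions made for illustration only. The sketch also does not model the three sorts of states.

```python
from collections import defaultdict


def normalize(counts):
    """Turn per-state transition counts into probabilities.

    Assumption: weights are plain usage counts, normalized per source state.
    counts maps state -> {(symbol, target_state): count}.
    """
    probs = {}
    for state, outgoing in counts.items():
        total = sum(outgoing.values())
        probs[state] = {key: c / total for key, c in outgoing.items()}
    return probs


def word_score(word, probs, initial, finals):
    """Forward pass over a non-deterministic automaton.

    Returns the summed probability of all runs on `word` that start in
    `initial` and end in a state from `finals`.
    """
    current = {initial: 1.0}
    for symbol in word:
        nxt = defaultdict(float)
        for state, mass in current.items():
            for (sym, target), p in probs.get(state, {}).items():
                if sym == symbol:
                    nxt[target] += mass * p
        current = nxt
    return sum(mass for state, mass in current.items() if state in finals)


def classify(word, models):
    """Pick the label whose automaton gives the word the highest score.

    Assumption: one probabilistic NFA per class, highest score wins.
    """
    return max(models, key=lambda label: word_score(word, *models[label]))


# Hypothetical toy counts for two classes over the alphabet {a, b}.
pos_counts = {
    "q0": {("a", "q0"): 1, ("a", "q1"): 1, ("b", "q0"): 2},
    "q1": {("b", "q1"): 1},
}
neg_counts = {
    "q0": {("b", "q0"): 1, ("b", "q1"): 1, ("a", "q0"): 2},
    "q1": {("a", "q1"): 1},
}
models = {
    "pos": (normalize(pos_counts), "q0", {"q1"}),
    "neg": (normalize(neg_counts), "q0", {"q1"}),
}

print(classify("ab", models))
print(classify("ba", models))
```

Because the automaton is non-deterministic, a word can have several runs; the forward pass sums the probability of all of them instead of following a single path.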
Taxonomy
Topics: Machine Learning and Algorithms · semigroups and automata theory · Natural Language Processing Techniques
