Classifying Words with 3-sort Automata

Tomasz Jastrząb; Frédéric Lardeux; Eric Monfroy

arXiv:2401.01314 · cs.FL · January 3, 2024 · 1 citation

Open Access

TL;DR

This paper introduces models for inferring 3-sort non-deterministic finite automata from data, transforming them into probabilistic NFAs, and demonstrates their effectiveness in classification tasks on various datasets.

Contribution

It presents a novel approach for inferring 3-sort NFAs and converting them into probabilistic models for classification, with experimental validation.

Findings

01. Probabilistic NFAs perform well in classification tasks.

02. The approach works on real-life and benchmark datasets.

03. Transformation from 3-sort NFA to probabilistic NFA is effective.

Abstract

Grammatical inference consists in learning a language or a grammar from data. In this paper, we consider a number of models for inferring a non-deterministic finite automaton (NFA) with 3 sorts of states that must accept some words and reject some other words from a given sample. We then propose a transformation from this 3-sort NFA into weighted-frequency and probabilistic NFAs, and we apply the latter to a classification task. The experimental evaluation of our approach shows that probabilistic NFAs can be successfully applied to classification tasks on both real-life and superficial benchmark data sets.
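To make the idea of moving from a weighted NFA to a probabilistic one more concrete, here is a minimal sketch in Python. It is not the paper's construction, which this page does not describe. It assumes that each transition already carries an integer weight (for example, a usage count), that weights are normalized per source state, and that a word is scored by summing the probabilities of all paths that end in a final state. The names `normalize`, `word_score`, and the toy automaton are hypothetical.

```python
from collections import defaultdict

def normalize(counts):
    """Convert transition weights into probabilities.

    Assumption: weights are normalized per source state, so the
    probabilities of all transitions leaving a state sum to 1.
    `counts` maps (source, symbol, target) -> non-negative weight.
    """
    totals = defaultdict(int)
    for (src, _sym, _dst), c in counts.items():
        totals[src] += c
    return {t: c / totals[t[0]] for t, c in counts.items()}

def word_score(probs, initial, finals, word):
    """Sum the probabilities of all paths that read `word` from
    `initial` and stop in a state of `finals` (forward computation)."""
    alpha = {initial: 1.0}  # probability mass currently on each state
    for sym in word:
        nxt = defaultdict(float)
        for (src, s, dst), p in probs.items():
            if s == sym and src in alpha:
                nxt[dst] += alpha[src] * p
        alpha = nxt
    return sum(mass for q, mass in alpha.items() if q in finals)

# Toy weighted automaton (hypothetical): two states, state 1 is final.
counts = {(0, "a", 0): 1, (0, "a", 1): 3, (1, "b", 1): 2}
probs = normalize(counts)

score_ab = word_score(probs, 0, {1}, "ab")
score_b = word_score(probs, 0, {1}, "b")
```

A classifier could then, for instance, build one such automaton per class and assign a word to the class whose automaton gives it the higher score. That decision rule is also an assumption of this sketch, not a statement of the authors' procedure.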

Taxonomy

Topics: Machine Learning and Algorithms · Semigroups and Automata Theory · Natural Language Processing Techniques