Learning Deterministic Weighted Automata with Queries and Counterexamples

Gail Weiss, Yoav Goldberg, and Eran Yahav

arXiv:1910.13895 · cs.LG · January 1, 2020 · 24 cites

TL;DR

This paper introduces an algorithm for extracting probabilistic deterministic finite automata (PDFAs) from black-box language models such as RNNs. It compares conditional probabilities under a local tolerance to cope with noise, and often achieves better WER and NDCG than spectral extraction of weighted finite automata (WFAs).

Contribution

It adapts the exact-learning algorithm L* to a probabilistic setting with noise, so that the automata extracted from neural networks are deterministic, stochastic, and more expressive than n-grams.

Findings

01

Often achieves better word error rate (WER) and NDCG than spectral extraction of WFAs from the same networks.

02

PDFAs are more expressive than n-grams and guaranteed to be stochastic and deterministic.

03

The algorithm extracts PDFAs from RNNs treated as black-box language models, often with better accuracy than spectral extraction.

Abstract

We present an algorithm for extraction of a probabilistic deterministic finite automaton (PDFA) from a given black-box language model, such as a recurrent neural network (RNN). The algorithm is a variant of the exact-learning algorithm L*, adapted to a probabilistic setting with noise. The key insight is the use of conditional probabilities for observations, and the introduction of a local tolerance when comparing them. When applied to RNNs, our algorithm often achieves better word error rate (WER) and normalised discounted cumulative gain (NDCG) than that achieved by spectral extraction of weighted finite automata (WFA) from the same networks. PDFAs are substantially more expressive than n-grams, and are guaranteed to be stochastic and deterministic - unlike spectrally extracted WFAs.
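
The idea of comparing conditional probabilities under a local tolerance can be illustrated with a small sketch. This is not the authors' implementation: the function name within_tolerance, the tolerance value t, and the three example distributions are all hypothetical. It assumes each prefix is summarised by its next-token distribution and that two prefixes are treated as equivalent when every entry differs by at most t.

```python
from typing import Sequence


def within_tolerance(p: Sequence[float], q: Sequence[float], t: float) -> bool:
    """Return True if p and q differ by at most t on every token.

    Assumption for illustration: p and q are next-token conditional
    distributions over the same, identically ordered token set.
    """
    if len(p) != len(q):
        raise ValueError("distributions must range over the same tokens")
    return all(abs(a - b) <= t for a, b in zip(p, q))


# Hypothetical next-token distributions over (a, b, end-of-sequence),
# as a black-box model might return them after three different prefixes.
after_prefix_1 = [0.70, 0.20, 0.10]
after_prefix_2 = [0.68, 0.23, 0.09]  # a slightly noisy copy of prefix 1
after_prefix_3 = [0.30, 0.60, 0.10]  # clearly different behaviour

same_12 = within_tolerance(after_prefix_1, after_prefix_2, t=0.05)
same_13 = within_tolerance(after_prefix_1, after_prefix_3, t=0.05)
print(same_12, same_13)
```

Under these assumptions the first two prefixes would be merged into one candidate state while the third stays separate. Note that a tolerance-based comparison like this one is not transitive in general, so it does not by itself define an equivalence relation over prefixes.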

Code & Models

Repositories

tech-srl/weighted_lstar (PyTorch, official)

Taxonomy

Topics: Machine Learning and Algorithms · Natural Language Processing Techniques · Topic Modeling