# Sampling from Stochastic Finite Automata with Applications to CTC Decoding

**Authors:** Martin Jansche, Alexander Gutkin

arXiv: 1905.08760 · 2019-09-24

## TL;DR

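This paper presents a path-sampling algorithm for stochastic finite automata. It conflates epsilon-cycles within strongly connected components so that the epsilon-graph is acyclic, which makes sampling efficient, and it remains effective under non-injective string transformations such as the one used in CTC decoding.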

## Contribution

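It introduces an algorithm that removes epsilon-cycles from a stochastic automaton by conflating them within strongly connected components, enabling efficient path-sampling, and applies sampling from the transformed labeling distribution in two strategies for finding the most probable CTC labeling.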

## Key findings

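- Path-sampling is effective, and can be efficient, when the epsilon-graph of the automaton is acyclic.
- Conflating epsilon-cycles within strongly connected components ensures that the epsilon-graph is acyclic.
- Sampling remains effective under non-injective string transformations.
- Efficient sampling from the transformed CTC labeling distribution supports two different strategies for finding the most probable labeling.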
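To make the non-injective transformation concrete, here is a minimal Python sketch of the standard CTC collapse map (merge adjacent repeats, then drop blanks) together with a naive sampler over independent per-frame distributions. It is an illustration only and not the paper's automaton-based algorithm; the names `BLANK`, `ctc_collapse`, `sample_labelings`, and the toy `frame_probs` values are assumptions introduced here.

```python
import random
from collections import Counter

BLANK = "_"  # assumed blank symbol, chosen for this illustration


def ctc_collapse(path):
    """Merge adjacent repeated symbols, then drop blanks."""
    out = []
    prev = None
    for sym in path:
        if sym != prev and sym != BLANK:
            out.append(sym)
        prev = sym
    return "".join(out)


def sample_labelings(frame_probs, n_samples, rng):
    """Sample one symbol per frame, collapse each path, and count labelings."""
    counts = Counter()
    for _ in range(n_samples):
        path = [
            rng.choices(list(p.keys()), weights=list(p.values()))[0]
            for p in frame_probs
        ]
        counts[ctc_collapse(path)] += 1
    return counts


# Several distinct paths collapse to the same labeling (non-injectivity).
print(ctc_collapse("a_b"), ctc_collapse("aab"), ctc_collapse("abb"))

# Hypothetical three-frame predictive distributions.
frame_probs = [
    {"a": 0.6, "_": 0.4},
    {"a": 0.3, "b": 0.3, "_": 0.4},
    {"b": 0.7, "_": 0.3},
]
counts = sample_labelings(frame_probs, 10_000, random.Random(0))
print(counts.most_common(3))  # relative frequencies estimate labeling probabilities
```

Because many paths map to one labeling, the most probable single path need not yield the most probable labeling; sampled labeling frequencies estimate the quantity that matters.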

## Abstract

Stochastic finite automata arise naturally in many language and speech processing tasks. They include stochastic acceptors, which represent certain probability distributions over random strings. We consider the problem of efficient sampling: drawing random string variates from the probability distribution represented by stochastic automata and transformations of those. We show that path-sampling is effective and can be efficient if the epsilon-graph of a finite automaton is acyclic. We provide an algorithm that ensures this by conflating epsilon-cycles within strongly connected components. Sampling is also effective in the presence of non-injective transformations of strings. We illustrate this in the context of decoding for Connectionist Temporal Classification (CTC), where the predictive probabilities yield auxiliary sequences which are transformed into shorter labeling strings. We can sample efficiently from the transformed labeling distribution and use this in two different strategies for finding the most probable CTC labeling.
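As a rough illustration of path-sampling, the following Python sketch draws strings from a toy stochastic acceptor by a random walk over its arcs. It assumes the automaton is already free of epsilon-cycles and does not implement the paper's conflation step; the `ACCEPTOR` table, its probabilities, and the `sample_string` helper are hypothetical.

```python
import random

EPS = None  # epsilon label: the arc emits no symbol

# Hypothetical toy acceptor: state -> list of (probability, label, next_state).
# A next_state of None means "stop", i.e. the state's final probability.
# Probabilities leaving each state sum to 1, and the only epsilon arc
# between states (0 -> 1) lies on no cycle.
ACCEPTOR = {
    0: [(0.5, "a", 0), (0.3, EPS, 1), (0.2, EPS, None)],
    1: [(0.6, "b", 1), (0.4, EPS, None)],
}


def sample_string(acceptor, start, rng):
    """Random-walk from `start`, emitting non-epsilon labels until stopping."""
    state = start
    out = []
    while state is not None:
        arcs = acceptor[state]
        _, label, nxt = rng.choices(arcs, weights=[a[0] for a in arcs])[0]
        if label is not EPS:
            out.append(label)
        state = nxt
    return "".join(out)


rng = random.Random(0)
print([sample_string(ACCEPTOR, 0, rng) for _ in range(5)])
```

In this toy automaton the empty string is produced by two different paths (stopping directly in state 0, or taking the epsilon arc to state 1 and stopping there), so its probability is 0.2 + 0.3 × 0.4 = 0.32. Sampling paths and reading off their labels yields string samples without enumerating such alternatives.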

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.08760/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1905.08760/full.md

## References

12 references — full list in the complete paper: https://tomesphere.com/paper/1905.08760/full.md

---
Source: https://tomesphere.com/paper/1905.08760