# Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees

**Authors:** Jan Chorowski, Adrian Lancucki, Bartosz Kostka, Michal Zapotoczny

arXiv: 1901.04379 · 2019-04-24

## TL;DR

This paper trains CTC-based neural acoustic models directly on context-dependent symbols. It replaces state-tying decision trees with a jointly trained symbol embedding network, which improves generalization to unseen and undersampled symbols.

## Contribution

It introduces a CD symbol embedding network trained jointly with the acoustic model, replacing traditional decision trees for context-dependent symbol modeling in CTC.
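To make the idea concrete, the following is a minimal sketch of how a CD symbol embedding could replace a per-symbol output table. It is an illustration under stated assumptions, not the authors' architecture: the toy phone inventory, the dimensions, the one-table-per-context-position layout, and the names `embed_cd_symbol` and `score` are all hypothetical, and the weights are random rather than trained.

```python
import math
import random

random.seed(0)

PHONES = ["a", "b", "k", "s", "t"]  # toy inventory (hypothetical)
EMB_DIM = 4   # size of each context-position embedding (assumed)
HID_DIM = 6   # size of the acoustic model's per-frame vector (assumed)

# One small embedding table per context position: left, center, right.
tables = {
    pos: {p: [random.gauss(0.0, 0.1) for _ in range(EMB_DIM)] for p in PHONES}
    for pos in ("left", "center", "right")
}

# A single layer that maps the concatenated embeddings to an output weight vector.
W = [[random.gauss(0.0, 0.1) for _ in range(3 * EMB_DIM)] for _ in range(HID_DIM)]


def embed_cd_symbol(left, center, right):
    """Compose an output-layer weight vector for the triphone (left, center, right)."""
    x = tables["left"][left] + tables["center"][center] + tables["right"][right]
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W]


def score(frame, triphone):
    """Unnormalized score of one CD symbol for one acoustic frame vector."""
    weights = embed_cd_symbol(*triphone)
    return sum(f * w for f, w in zip(frame, weights))


# Any triphone gets a score, including ones that never occurred in training,
# because its weights are composed from the shared per-position embeddings.
frame = [random.gauss(0.0, 1.0) for _ in range(HID_DIM)]
rare_triphone = ("k", "a", "t")
rare_score = score(frame, rare_triphone)

# Parameter count: a full per-triphone output table vs. the embedding network.
n_triphones = len(PHONES) ** 3
params_full_table = n_triphones * HID_DIM
params_embedding = 3 * len(PHONES) * EMB_DIM + HID_DIM * 3 * EMB_DIM
```

In this toy setup the embedding network needs far fewer parameters than a table with one row per triphone, and the gap widens as the inventory grows, since the table scales cubically with the number of phones while the embedding tables scale linearly.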

## Key findings

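- Utterance-level normalization of probabilities mitigates the overfitting and the interference with external language models that frame-level normalization induces in CTC.
- The CD symbol embedding network saves parameters and generalizes to unseen and undersampled CD symbols.
- The approach removes the need to transfer state-tying decision trees from classical GMM-HMM systems.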

## Abstract

Deep neural acoustic models benefit from context-dependent (CD) modeling of output symbols. We consider direct training of CTC networks with CD outputs, and identify two issues. The first one is frame-level normalization of probabilities in CTC, which induces strong language modeling behavior that leads to overfitting and interference with external language models. The second one is poor generalization in the presence of numerous lexical units like triphones or tri-chars. We mitigate the former with utterance-level normalization of probabilities. The latter typically requires reducing the CD symbol inventory with state-tying decision trees, which have to be transferred from classical GMM-HMM systems. We replace the trees with a CD symbol embedding network, which saves parameters and ensures generalization to unseen and undersampled CD symbols. The embedding network is trained together with the rest of the acoustic model and removes one of the last cases in which neural systems have to be bootstrapped from GMM-HMM ones.
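The difference between the two normalization schemes can be written out as follows. This is a generic sketch rather than the paper's exact formulation: the symbols $s_t(k)$ (the network's unnormalized score for symbol $k$ at frame $t$), $\mathcal{B}^{-1}(y)$ (the set of CTC paths that collapse to the label sequence $y$), and $\mathcal{A}$ (a set of admissible paths) are notation assumed here, and how the paper defines the admissible set is not stated in this summary.

```latex
% Frame-level normalization (standard CTC): a softmax at every frame.
p_{\text{frame}}(y \mid x)
  = \sum_{\pi \in \mathcal{B}^{-1}(y)} \prod_{t=1}^{T}
    \frac{\exp s_t(\pi_t)}{\sum_{k} \exp s_t(k)}

% Utterance-level normalization: one normalizer for the whole utterance.
p_{\text{utt}}(y \mid x)
  = \frac{\sum_{\pi \in \mathcal{B}^{-1}(y)} \exp \sum_{t=1}^{T} s_t(\pi_t)}
         {\sum_{\pi' \in \mathcal{A}} \exp \sum_{t=1}^{T} s_t(\pi'_t)}
```

If $\mathcal{A}$ contains every possible path, the denominator factorizes into $\prod_t \sum_k \exp s_t(k)$ and the two expressions coincide. The schemes therefore differ only when $\mathcal{A}$ is restricted, for example to paths whose consecutive CD symbols have consistent contexts (an assumption made here for illustration).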

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.04379/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1901.04379/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1901.04379/full.md

---
Source: https://tomesphere.com/paper/1901.04379