# Attention model for articulatory features detection

**Authors:** Ievgen Karaulov, Dmytro Tkanov

arXiv: 1907.01914 · 2019-07-04

## TL;DR

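This paper applies the attention-based Listen, Attend and Spell (LAS) architecture to phone recognition and articulatory feature detection on a small training set such as TIMIT. It introduces a novel decoding technique for training manner and place of articulation detectors end-to-end, and explores multitask learning for the joint task.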

## Contribution

It introduces a new attention-based approach for articulatory feature detection, including a novel decoding method and joint phoneme and articulatory recognition in an end-to-end framework.

## Key findings

- Effective detection of articulatory features on small datasets
- Novel decoding technique improves articulation detection
- Joint recognition enhances phoneme and feature accuracy

## Abstract

Articulatory distinctive features, as well as phonetic transcription, play an important role in speech-related tasks: computer-assisted pronunciation training, text-to-speech conversion (TTS), studying speech production mechanisms, and speech recognition for low-resourced languages. End-to-end approaches to speech-related tasks have gained a lot of traction in recent years. We apply the Listen, Attend and Spell (LAS) \cite{Chan-LAS2016} architecture to phone recognition on a small training set, such as TIMIT \cite{TIMIT-1992}. We also introduce a novel decoding technique that allows manner and place of articulation detectors to be trained end-to-end using attention models. Finally, we explore joint phone recognition and articulatory feature detection in a multitask learning setting.
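
To make the multitask setting concrete, here is a minimal Python sketch of how manner and place targets could be derived from a phone transcription, and how per-task losses could be combined. It is an illustration under stated assumptions, not the authors' method: the `ARTICULATION` table is a small hypothetical subset using TIMIT-style labels, and `articulatory_targets`, `multitask_loss`, and `aux_weight` are names invented for this example. The abstract does not say how the paper weights its losses.

```python
# Hypothetical phone -> (manner, place) table. A small illustrative subset
# with TIMIT-style labels; NOT the feature inventory used in the paper.
ARTICULATION = {
    "p": ("stop", "bilabial"),
    "b": ("stop", "bilabial"),
    "t": ("stop", "alveolar"),
    "d": ("stop", "alveolar"),
    "k": ("stop", "velar"),
    "g": ("stop", "velar"),
    "m": ("nasal", "bilabial"),
    "n": ("nasal", "alveolar"),
    "ng": ("nasal", "velar"),
    "f": ("fricative", "labiodental"),
    "v": ("fricative", "labiodental"),
    "s": ("fricative", "alveolar"),
    "z": ("fricative", "alveolar"),
}


def articulatory_targets(phones):
    """Turn one phone sequence into parallel manner and place sequences.

    Raises KeyError for a phone that is missing from the table.
    """
    manners, places = [], []
    for phone in phones:
        manner, place = ARTICULATION[phone]
        manners.append(manner)
        places.append(place)
    return manners, places


def multitask_loss(phone_loss, manner_loss, place_loss, aux_weight=0.5):
    """Weighted sum of per-task losses (an assumed, commonly used form).

    aux_weight scales the two articulatory tasks relative to the phone task.
    """
    return phone_loss + aux_weight * (manner_loss + place_loss)


manners, places = articulatory_targets(["s", "p", "m"])
total = multitask_loss(phone_loss=1.0, manner_loss=0.5, place_loss=0.5)
```

In this sketch, each task would have its own output sequence while sharing the same acoustic input, which is one common way to set up multitask training. Whether the paper shares the encoder, the decoder, or both is not stated in the abstract.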

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.01914/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1907.01914/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1907.01914/full.md

---
Source: https://tomesphere.com/paper/1907.01914