# SylNet: An Adaptable End-to-End Syllable Count Estimator for Speech

**Authors:** Shreyas Seshadri, Okko Räsänen

arXiv: 1906.09825 · 2019-09-04

## TL;DR

SylNet is a new neural network-based end-to-end system for automatic syllable counting in speech, capable of generalizing across languages and improving with limited adaptation data.

## Contribution

The paper introduces SylNet, a novel neural network architecture for syllable counting that does not require aligned syllable annotations and can adapt to new languages with minimal data.

## Key findings

- SylNet outperforms several previously proposed syllabification methods, including end-to-end BLSTMs.
- It generalizes to languages beyond its training data.
- The model improves with limited language-specific adaptation data.

## Abstract

Automatic syllable count estimation (SCE) is used in a variety of applications ranging from speaking rate estimation to detecting social activity from wearable microphones or developmental research concerned with quantifying speech heard by language-learning children in different environments. The majority of previously utilized SCE methods have relied on heuristic DSP methods, and only a small number of bi-directional long short-term memory (BLSTM) approaches have made use of modern machine learning approaches in the SCE task. This paper presents a novel end-to-end method called SylNet for automatic syllable counting from speech, built on recent developments in neural network architectures. We describe how the entire model can be optimized directly to minimize SCE error on the training data without annotations aligned at the syllable level, and how it can be adapted to new languages using limited speech data with known syllable counts. Experiments on several different languages reveal that SylNet generalizes to languages beyond its training data and further improves with adaptation. It also outperforms several previously proposed methods for syllabification, including end-to-end BLSTMs.
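The idea of training against utterance-level counts, without syllable-aligned labels, can be illustrated with a minimal sketch. It is not the SylNet architecture or the authors' loss: the per-frame scores, the ReLU accumulation, the absolute-error loss, and the function names `predicted_count`, `count_loss`, and `relative_error` are all assumptions made for illustration. The only supervision used is one known syllable count per utterance.

```python
import numpy as np


def predicted_count(frame_scores: np.ndarray) -> float:
    """Collapse per-frame scores into one utterance-level count.

    Assumption: negative scores are clipped to zero (ReLU) so that
    each frame can only add to the running count.
    """
    return float(np.sum(np.maximum(frame_scores, 0.0)))


def count_loss(frame_scores: np.ndarray, true_count: int) -> float:
    """Absolute error between the predicted and the known count.

    No frame-level or syllable-level alignment is needed: the target
    is a single integer per utterance.
    """
    return abs(predicted_count(frame_scores) - true_count)


def relative_error(pred_counts, true_counts) -> float:
    """Mean absolute relative count error over utterances, in percent.

    Assumes every true count is greater than zero.
    """
    pred = np.asarray(pred_counts, dtype=float)
    true = np.asarray(true_counts, dtype=float)
    return float(100.0 * np.mean(np.abs(pred - true) / true))


if __name__ == "__main__":
    # Hypothetical per-frame scores for a three-syllable utterance.
    scores = np.array([0.9, -1.0, 1.1, 1.0])
    print(predicted_count(scores))
    print(count_loss(scores, true_count=3))
    print(relative_error([9, 22], [10, 20]))
```

Under the same assumptions, adapting to a new language would mean continuing to minimize `count_loss` on a small set of utterances from that language with known syllable counts.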

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.09825/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1906.09825/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1906.09825/full.md

---
Source: https://tomesphere.com/paper/1906.09825