# wav2letter++: The Fastest Open-source Speech Recognition System

**Authors:** Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert

arXiv: 1812.07625 · 2020-02-25

## TL;DR

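wav2letter++ is an open-source speech recognition framework written entirely in C++ on top of the ArrayFire tensor library. In some cases it trains end-to-end models more than 2x faster than other optimized frameworks, and its training times scale linearly to 64 GPUs, which supports fast iteration in research and development.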

## Contribution

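The paper presents wav2letter++, a C++ speech recognition system built on the ArrayFire tensor library, explains its architecture and design, and compares it with other major open-source speech recognition systems. It reports training that is in some cases more than 2x faster than other optimized frameworks and that scales linearly to 64 GPUs.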

## Key findings

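- In some cases, wav2letter++ is more than 2x faster than other optimized frameworks for training end-to-end neural networks for speech recognition.
- Training times scale linearly to 64 GPUs, the highest count tested, for models with 100 million parameters.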
- The system enables rapid iteration for research and model tuning.
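
One common way to make "more than 2x faster" and "scales linearly" concrete is to compute a speedup ratio and a per-GPU scaling efficiency from measured training times. The sketch below is illustrative only: the function names are hypothetical, the paper may define its measurements differently, and the timings are placeholders rather than results from the paper.

```python
def speedup(baseline_time: float, measured_time: float) -> float:
    """Return how many times faster the measured run is than the baseline."""
    if baseline_time <= 0 or measured_time <= 0:
        raise ValueError("times must be positive")
    return baseline_time / measured_time


def scaling_efficiency(single_gpu_time: float, multi_gpu_time: float, num_gpus: int) -> float:
    """Return speedup divided by GPU count; 1.0 means perfectly linear scaling."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return speedup(single_gpu_time, multi_gpu_time) / num_gpus


# Placeholder timings in hours per epoch. These are NOT measurements from the paper.
hypothetical_epoch_hours = {1: 64.0, 8: 8.0, 64: 1.0}

single_gpu_hours = hypothetical_epoch_hours[1]
for gpus, hours in hypothetical_epoch_hours.items():
    eff = scaling_efficiency(single_gpu_hours, hours, gpus)
    print(f"{gpus:>2} GPUs: speedup {speedup(single_gpu_hours, hours):.1f}x, efficiency {eff:.2f}")
```

Under this definition, an efficiency that stays close to 1.0 as the GPU count grows corresponds to linear scaling, and a speedup above 2.0 against another framework's time for the same workload corresponds to "more than 2x faster".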

## Abstract

This paper introduces wav2letter++, the fastest open-source deep learning speech recognition framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for maximum efficiency. Here we explain the architecture and design of the wav2letter++ system and compare it to other major open-source speech recognition systems. In some cases wav2letter++ is more than 2x faster than other optimized frameworks for training end-to-end neural networks for speech recognition. We also show that wav2letter++'s training times scale linearly to 64 GPUs, the highest we tested, for models with 100 million parameters. High-performance frameworks enable fast iteration, which is often a crucial factor in successful research and model tuning on new datasets and tasks.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.07625/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1812.07625/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/1812.07625/full.md

---
Source: https://tomesphere.com/paper/1812.07625