TheanoLM - An Extensible Toolkit for Neural Network Language Modeling

Seppo Enarvi; Mikko Kurimo

arXiv:1605.00942·cs.CL·July 13, 2017

TL;DR

TheanoLM is a flexible, fast toolkit for neural network language modeling, built on Theano for extensibility and GPU acceleration. It improves speech recognition performance over back-off n-gram models and trains faster than existing toolkits.

Contribution

It introduces an extensible Python-based toolkit for neural language modeling that achieves faster training and comparable or better results than existing systems.

Findings

01

Significant improvement over the authors' best back-off n-gram models in Finnish and English conversational speech recognition.

02

Results in the Finnish task are as good as or better than those from the existing RNNLM and RWTHLM toolkits.

03

Training times are an order of magnitude shorter than with RNNLM and RWTHLM.

Abstract

We present a new tool for training neural network language models (NNLMs), scoring sentences, and generating text. The tool is written using the Python library Theano, which allows researchers to easily extend it and tune any aspect of the training process. Despite this flexibility, Theano is able to generate extremely fast native code that can utilize a GPU or multiple CPU cores to parallelize the heavy numerical computations. The tool has been evaluated on difficult Finnish and English conversational speech recognition tasks, and a significant improvement was obtained over our best back-off n-gram models. The results that we obtained in the Finnish task were compared to those from the existing RNNLM and RWTHLM toolkits and found to be as good or better, while training times were an order of magnitude shorter.
