NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev, Jason Li, Huyen Nguyen, Oleksii Hrinchuk, Ryan Leary, Boris Ginsburg, Samuel Kriman, Stanislav Beliaev, Vitaly Lavrukhin, Jack Cook, Patrice Castonguay, Mariya Popova, Jocelyn Huang, Jonathan M. Cohen

TL;DR
NeMo is an open-source Python toolkit for building AI applications from modular, reusable neural components. A neural type system checks that components are connected in a semantically valid way. The toolkit targets speech and language tasks and supports distributed training.
Contribution
The paper introduces a framework built on neural modules, with a neural type system that checks semantic correctness when modules are connected, so that reusable components can be assembled flexibly into speech and language applications.
Findings
Supports distributed training and mixed precision on NVIDIA GPUs
Provides extendable collections for speech recognition and NLP
Enhances modularity and reusability in AI application development
Abstract
NeMo (Neural Modules) is a Python framework-agnostic toolkit for creating AI applications through re-usability, abstraction, and composition. NeMo is built around neural modules, conceptual blocks of neural networks that take typed inputs and produce typed outputs. Such modules typically represent data layers, encoders, decoders, language models, loss functions, or methods of combining activations. NeMo makes it easy to combine and re-use these building blocks while providing a level of semantic correctness checking via its neural type system. The toolkit comes with extendable collections of pre-built modules for automatic speech recognition and natural language processing. Furthermore, NeMo provides built-in support for distributed training and mixed precision on the latest NVIDIA GPUs. NeMo is open-source: https://github.com/NVIDIA/NeMo
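The idea of modules with typed inputs and outputs can be illustrated with a small, self-contained sketch. This is not NeMo's API: the names `NeuralType`, `Typed`, `Module`, and `TypeMismatch` below are hypothetical stand-ins written for this example, and the type check is a simple equality test, which is an assumption about how such checking might work rather than a description of the toolkit's implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class NeuralType:
    """Hypothetical type tag: named axes plus the kind of data carried."""
    axes: Tuple[str, ...]  # e.g. ("batch", "time", "feature")
    kind: str              # e.g. "spectrogram"


@dataclass(frozen=True)
class Typed:
    """A value paired with the neural type it is declared to have."""
    value: Any
    ntype: NeuralType


class TypeMismatch(TypeError):
    """Raised when a module is wired to an input of the wrong type."""


class Module:
    """Hypothetical neural module with typed input and output ports."""

    def __init__(self, name: str,
                 inputs: Dict[str, NeuralType],
                 outputs: Dict[str, NeuralType],
                 fn: Callable[..., Dict[str, Any]]):
        self.name = name
        self.inputs = inputs
        self.outputs = outputs
        self.fn = fn

    def __call__(self, **ports: Typed) -> Dict[str, Typed]:
        # Every declared input port must be supplied, and nothing else.
        if set(ports) != set(self.inputs):
            raise TypeMismatch(
                f"{self.name}: expected ports {sorted(self.inputs)}, "
                f"got {sorted(ports)}")
        # Assumption: two types are compatible only if they are equal.
        for port, typed in ports.items():
            if typed.ntype != self.inputs[port]:
                raise TypeMismatch(
                    f"{self.name}.{port}: expected {self.inputs[port]}, "
                    f"got {typed.ntype}")
        raw = self.fn(**{p: t.value for p, t in ports.items()})
        # Tag each output with its declared type so it can be checked downstream.
        return {p: Typed(raw[p], t) for p, t in self.outputs.items()}


FEATURES = NeuralType(("batch", "time", "feature"), "spectrogram")
ENCODED = NeuralType(("batch", "time", "hidden"), "encoded")
LOGITS = NeuralType(("batch", "time", "vocab"), "logits")

# Toy computations stand in for real encoder and decoder networks.
encoder = Module("encoder", {"x": FEATURES}, {"y": ENCODED},
                 lambda x: {"y": [[2 * v for v in row] for row in x]})
decoder = Module("decoder", {"y": ENCODED}, {"logits": LOGITS},
                 lambda y: {"logits": [[v + 1 for v in row] for row in y]})

batch = Typed([[1.0, 2.0]], FEATURES)
encoded = encoder(x=batch)["y"]
logits = decoder(y=encoded)["logits"]
```

In this sketch, chaining `encoder` into `decoder` succeeds because the encoder's output type matches the decoder's input type, while calling `decoder(y=batch)` raises `TypeMismatch` because raw features are not of the `ENCODED` type. The point is that a wiring error surfaces when modules are connected, before any training is run.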
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
