SpliceOut: A Simple and Efficient Audio Augmentation Method
Arjit Jain, Pranay Reddy Samala, Deepak Mittal, Preethi Jyothi, Maneesh Singh

TL;DR
SpliceOut is a simple, efficient audio augmentation technique that improves performance across various speech and audio tasks, including supervised, semi-supervised, and self-supervised learning, by modifying time masking.
Contribution
The paper introduces SpliceOut, a modification to time masking that is computationally more efficient and applies broadly across speech and audio tasks.
Findings
SpliceOut matches or outperforms SpecAugment on multiple tasks.
It provides additional gains when combined with other augmentation methods.
It is effective in supervised, semi-supervised, and self-supervised settings.
Abstract
Time masking has become a de facto augmentation technique for speech and audio tasks, including automatic speech recognition (ASR) and audio classification, most notably as a part of SpecAugment. In this work, we propose SpliceOut, a simple modification to time masking which makes it computationally more efficient. SpliceOut performs comparably to (and sometimes outperforms) SpecAugment on a wide variety of speech and audio tasks, including ASR for seven different languages using varying amounts of training data, as well as on speech translation, sound and music classification, thus establishing itself as a broadly applicable audio augmentation method. SpliceOut also provides additional gains when used in conjunction with other augmentation techniques. Apart from the fully-supervised setting, we also demonstrate that SpliceOut can complement unsupervised representation learning with…
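The abstract does not spell out the mechanism, so the following is only a minimal sketch of the contrast it describes. It assumes, as the name suggests, that SpliceOut removes the selected time intervals and splices the remaining frames together, whereas standard time masking zeroes them out in place. The function names `time_mask` and `splice_out`, the parameters `num_masks` and `max_width`, and the (frequency, time) spectrogram layout are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def _sample_intervals(num_frames, num_masks, max_width, rng):
    """Sample (start, width) pairs along the time axis (hypothetical helper)."""
    intervals = []
    for _ in range(num_masks):
        width = int(rng.integers(0, max_width + 1))  # width in [0, max_width]
        start = int(rng.integers(0, max(1, num_frames - width + 1)))
        intervals.append((start, width))
    return intervals


def time_mask(spec, num_masks=2, max_width=10, rng=None):
    """Baseline time masking: zero out random time intervals.

    The output keeps the same number of frames as the input.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()
    for start, width in _sample_intervals(spec.shape[1], num_masks, max_width, rng):
        out[:, start:start + width] = 0.0
    return out


def splice_out(spec, num_masks=2, max_width=10, rng=None):
    """Assumed SpliceOut behaviour: drop the sampled intervals and
    concatenate the frames that remain, so the output is shorter."""
    rng = np.random.default_rng() if rng is None else rng
    keep = np.ones(spec.shape[1], dtype=bool)
    for start, width in _sample_intervals(spec.shape[1], num_masks, max_width, rng):
        keep[start:start + width] = False
    return spec[:, keep]


# Usage: an 80-bin, 100-frame dummy spectrogram.
spec = np.random.default_rng(0).random((80, 100))
masked = time_mask(spec)    # still 100 frames, some zeroed
spliced = splice_out(spec)  # between 80 and 100 frames
```

Under this assumption, the spliced output has fewer frames than the masked one, which is one plausible reading of the efficiency claim: downstream layers process a shorter sequence instead of spending computation on zeroed frames.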
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
