Learning from Between-class Examples for Deep Sound Recognition
Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada

TL;DR
This paper introduces Between-Class learning (BC learning) for deep sound recognition. The model is trained on mixtures of two sounds from different classes and learns to output the mixing ratio, which yields a more discriminative feature space and recognition performance that surpasses the human level.
Contribution
The paper proposes BC learning, a novel data mixing strategy, and develops EnvNet-v2, a new deep sound recognition network trained with BC learning.
Findings
BC learning improves recognition accuracy across various networks and datasets.
BC learning enlarges Fisher's criterion and regularizes class feature relationships.
EnvNet-v2 trained with BC learning surpasses human-level performance.
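For readers unfamiliar with Fisher's criterion mentioned in the findings, the sketch below computes the classical two-class form on one-dimensional features: the squared distance between class means divided by the summed within-class scatter. It is only an illustration of the quantity, not the paper's analysis; the function name `fisher_criterion` and the toy feature values are assumptions made for this example.

```python
def fisher_criterion(features_a, features_b):
    """Two-class Fisher criterion on 1-D features (illustrative helper).

    Returns (mean_a - mean_b)^2 / (scatter_a + scatter_b), where scatter is
    the sum of squared deviations from the class mean.
    """
    mean_a = sum(features_a) / len(features_a)
    mean_b = sum(features_b) / len(features_b)
    scatter_a = sum((x - mean_a) ** 2 for x in features_a)
    scatter_b = sum((x - mean_b) ** 2 for x in features_b)
    return (mean_a - mean_b) ** 2 / (scatter_a + scatter_b)


# Toy values (made up): same class means, different within-class spread.
tight = fisher_criterion([0.9, 1.0, 1.1], [2.9, 3.0, 3.1])
loose = fisher_criterion([0.0, 1.0, 2.0], [2.0, 3.0, 4.0])
```

A larger value means the classes are farther apart relative to their spread, so `tight` exceeds `loose` here even though the class means are identical in both cases.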
Abstract
Deep learning methods have achieved high performance in sound recognition tasks. Deciding how to feed the training data is important for further performance improvement. We propose a novel learning method for deep sound recognition: Between-Class learning (BC learning). Our strategy is to learn a discriminative feature space by recognizing the between-class sounds as between-class sounds. We generate between-class sounds by mixing two sounds belonging to different classes with a random ratio. We then input the mixed sound to the model and train the model to output the mixing ratio. The advantages of BC learning are not limited only to the increase in variation of the training data; BC learning leads to an enlargement of Fisher's criterion in the feature space and a regularization of the positional relationship among the feature distributions of the classes. The experimental results show…
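The mixing step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes plain linear mixing of equal-length waveforms and a soft label built from the same ratio, whereas the exact mixing formula and loss are not given in this summary. The names `one_hot` and `mix_between_class` are hypothetical.

```python
import random


def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def mix_between_class(sound_a, label_a, sound_b, label_b, num_classes, rng=random):
    """Mix two sounds from different classes with a random ratio r.

    Assumption: simple linear mixing r * a + (1 - r) * b. The training
    target is the ratio itself, expressed as a soft label over classes.
    """
    if label_a == label_b:
        raise ValueError("between-class mixing needs two different classes")
    if len(sound_a) != len(sound_b):
        raise ValueError("waveforms must have the same length")
    r = rng.random()  # random mixing ratio in [0, 1)
    mixed = [r * a + (1.0 - r) * b for a, b in zip(sound_a, sound_b)]
    target = [
        r * ta + (1.0 - r) * tb
        for ta, tb in zip(one_hot(label_a, num_classes), one_hot(label_b, num_classes))
    ]
    return mixed, target, r


# Toy waveforms (made up) for classes 0 and 2 out of 3 classes.
rng = random.Random(0)
mixed, target, r = mix_between_class(
    [0.1, -0.2, 0.3], 0, [0.0, 0.5, -0.5], 2, num_classes=3, rng=rng
)
```

The target sums to 1 and places weight `r` on the first class and `1 - r` on the second, so a model trained against it is asked to recover the mixing ratio rather than a single class.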
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Animal Vocal Communication and Behavior
