Data Augmentation of Room Classifiers using Generative Adversarial Networks
Constantinos Papayiannis, Christine Evers, Patrick A. Naylor

TL;DR
This paper introduces a GAN-based data augmentation technique for room classification using reverberant speech, significantly improving classifier accuracy without additional data collection.
Contribution
It proposes a novel acoustic environment representation and a GAN training method to generate realistic artificial room data for improved classification.
Findings
Augmenting the training set with GAN-generated data increased classifier test accuracy from 89.4% to 95.5%.
GAN-generated data effectively enhances classifier performance.
Method reduces need for extensive real-world data collection.
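To put the reported accuracy figures in perspective, the short sketch below converts them into an absolute gain and a relative reduction in error rate. Only the two accuracy values come from the findings above; the function name `accuracy_gain` and the dictionary keys are illustrative choices, not taken from the paper.

```python
def accuracy_gain(acc_before: float, acc_after: float) -> dict:
    """Summarise an accuracy improvement; all inputs and outputs are in percent."""
    err_before = 100.0 - acc_before  # error rate without augmentation
    err_after = 100.0 - acc_after    # error rate with augmentation
    return {
        "absolute_gain_points": acc_after - acc_before,
        "error_before": err_before,
        "error_after": err_after,
        # Share of the original errors that the augmented classifier no longer makes.
        "relative_error_reduction": 100.0 * (err_before - err_after) / err_before,
    }


# Accuracy values as reported in the findings above.
result = accuracy_gain(89.4, 95.5)
print(f"Absolute gain: {result['absolute_gain_points']:.1f} percentage points")
print(f"Relative error reduction: {result['relative_error_reduction']:.1f}%")
```

Under this reading, the gain of about 6.1 percentage points corresponds to the error rate falling from roughly 10.6% to 4.5%, which removes more than half of the original errors.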
Abstract
The classification of acoustic environments allows machines to better understand the auditory world around them. Using deep learning to teach machines to discriminate between rooms is a new area of research. As with other learning tasks, it suffers from high dimensionality and the limited availability of training data. Data augmentation methods have proven useful in addressing this issue in sound event detection and scene classification. This paper proposes a data augmentation method for room classification from reverberant speech. Generative Adversarial Networks (GANs) are trained to generate artificial data as if they had been measured in real rooms. This provides additional training examples to the classifiers without any additional data collection, which is time-consuming and often impractical. A…
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Hearing Loss and Rehabilitation
