Mask-Net: Learning Context Aware Invariant Features using Adversarial Forgetting (Student Abstract)
Hemant Yadav, Atul Anshuman Singh, Rachit Mittal, Sunayana Sitaram, Yi Yu, Rajiv Ratn Shah

TL;DR
This paper introduces Mask-Net, a novel adversarial forgetting approach to learn invariant features for speech-to-text systems, improving generalization and reducing errors across different datasets.
Contribution
The paper proposes a new adversarial forgetting method to induce invariance in features, enhancing robustness in speech recognition models.
Findings
Achieved 2.2% absolute WER improvement on out-of-distribution data.
Achieved 1.3% absolute WER improvement on in-distribution data.
Demonstrated better generalization compared to traditional models.
Abstract
Training a robust system, e.g., Speech to Text (STT), requires large datasets. Variability in the dataset, such as unwanted nuisances and biases, is the reason large datasets are needed to learn general representations. In this work, we propose a novel approach to induce invariance using adversarial forgetting (AF). Our initial experiments on learning features that are invariant to nuisance factors such as accent on the STT task achieve better generalization in terms of word error rate (WER) compared to the traditional models. We observe an absolute improvement of 2.2% and 1.3% on out-of-distribution and in-distribution test sets, respectively.
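The abstract reports its gains as absolute WER improvements. As a rough illustration of what that metric measures, the sketch below computes WER as the word-level edit distance divided by the number of reference words, and an absolute improvement as a plain difference between two WER values. The function names and the example sentences and numbers are hypothetical and are not taken from the paper; this is the standard WER definition, not the authors' evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_improvement(baseline_wer: float, new_wer: float) -> float:
    """Absolute improvement is the plain difference, not a ratio."""
    return baseline_wer - new_wer


# Hypothetical example: one substitution ("sat" -> "sit") and one deletion
# ("the"), i.e. 2 edits over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# Illustrative numbers only (not from the paper): a baseline WER of 20.0%
# and a new WER of 17.8% differ by about 2.2 percentage points.
gain = absolute_improvement(0.200, 0.178)
```

Note that an absolute improvement of 2.2% means the WER dropped by 2.2 percentage points; the relative improvement depends on the baseline WER, which this summary does not state.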
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Multimodal Machine Learning Applications
