# Attentive Adversarial Learning for Domain-Invariant Training

**Authors:** Zhong Meng, Jinyu Li, Yifan Gong

arXiv: 1904.12400 · 2019-04-30

## TL;DR

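This paper introduces attentive adversarial domain-invariant training (AADIT), which adds an attention mechanism to the domain classifier so that adversarial training concentrates on the deep features most affected by domain variability. On CHiME-3, AADIT reduces WER by 13.6% relative to a multi-conditional model and by 9.3% relative to a strong ADIT baseline.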

## Contribution

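The paper adds an attention mechanism to the auxiliary domain classifier used in adversarial domain-invariant training (ADIT), so that the deep features fed to that classifier are weighted by their importance for domain classification instead of equally. The attention block is external to the DNN acoustic model and is not involved in ASR, so the method can be applied to any DNN acoustic-model architecture.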
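To make the contrast between equal weighting and attentive re-weighting concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the scoring function (a `tanh` projection followed by a learned query vector), the names `attentive_pool`, `uniform_pool`, `w_key`, and `query`, and the choice to pool over a sequence of frames are all assumptions made for illustration. The actual attention block is described in the complete paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def uniform_pool(deep_feats):
    """Equal weighting: every frame contributes the same amount.

    deep_feats: array of shape (T, D), T frames of D-dimensional deep features.
    """
    return deep_feats.mean(axis=0)


def attentive_pool(deep_feats, w_key, query):
    """Attention-weighted pooling (hypothetical scoring function).

    deep_feats: (T, D) deep features from the acoustic model.
    w_key:      (D, A) projection matrix (assumed learnable parameter).
    query:      (A,)   query vector (assumed learnable parameter).
    Returns the pooled feature (D,) and the attention weights (T,).
    """
    keys = np.tanh(deep_feats @ w_key)   # (T, A)
    scores = keys @ query                # (T,) one score per frame
    weights = softmax(scores)            # (T,) non-negative, sums to 1
    context = weights @ deep_feats       # (D,) weighted sum of frames
    return context, weights
```

In an adversarial setup, the pooled vector (rather than the unweighted features) would be what the domain classifier sees. With an all-zero `query`, every score is equal and `attentive_pool` reduces to `uniform_pool`, which is one way to view equal weighting as a special case of the attentive variant.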

## Key findings

- Achieves a 13.6% relative WER reduction over a multi-conditional model on CHiME-3
- Achieves a 9.3% relative WER reduction over a strong ADIT baseline on CHiME-3
- Attentive re-weighting focuses domain normalization on the phonetic components most susceptible to domain variability
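For readers unfamiliar with relative WER figures, the short sketch below shows how such a percentage is computed and what the two reported numbers imply about each other. The function names are illustrative, and the WER values in the usage note are hypothetical, not taken from the paper. The implied ADIT-versus-multi-conditional figure is simple arithmetic on the two reported percentages and assumes both were measured on the same test data.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


def implied_intermediate_reduction(total_pct, last_step_pct):
    """Relative reduction of an intermediate system over the first baseline.

    total_pct:     reduction of the final system over the first baseline (%).
    last_step_pct: reduction of the final system over the intermediate one (%).
    Relative reductions compose multiplicatively, so
    (1 - total) = (1 - intermediate) * (1 - last_step).
    """
    remaining_total = 1.0 - total_pct / 100.0
    remaining_last = 1.0 - last_step_pct / 100.0
    return 100.0 * (1.0 - remaining_total / remaining_last)


# Derived from the two reported figures (13.6% and 9.3%), not stated in the source.
adit_vs_multi = implied_intermediate_reduction(13.6, 9.3)
```

As a usage example with made-up numbers, `relative_wer_reduction(10.0, 8.64)` evaluates to about 13.6. Under the stated assumption, `adit_vs_multi` suggests the ADIT baseline itself sits a little under 5% relative below the multi-conditional model.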

## Abstract

Adversarial domain-invariant training (ADIT) proves to be effective in suppressing the effects of domain variability in acoustic modeling and has led to improved performance in automatic speech recognition (ASR). In ADIT, an auxiliary domain classifier takes in equally-weighted deep features from a deep neural network (DNN) acoustic model and is trained to improve their domain-invariance by optimizing an adversarial loss function. In this work, we propose an attentive ADIT (AADIT) in which we advance the domain classifier with an attention mechanism to automatically weight the input deep features according to their importance in domain classification. With this attentive re-weighting, AADIT can focus on the domain normalization of phonetic components that are more susceptible to domain variability and generates deep features with improved domain-invariance and senone-discriminativity over ADIT. Most importantly, the attention block serves only as an external component to the DNN acoustic model and is not involved in ASR, so AADIT can be used to improve the acoustic modeling with any DNN architectures. More generally, the same methodology can improve any adversarial learning system with an auxiliary discriminator. Evaluated on CHiME-3 dataset, the AADIT achieves 13.6% and 9.3% relative WER improvements, respectively, over a multi-conditional model and a strong ADIT baseline.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.12400/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1904.12400/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1904.12400/full.md

---
Source: https://tomesphere.com/paper/1904.12400