Multi-source Domain Adaptation for Text-independent Forensic Speaker Recognition

Zhenyu Wang and John H. L. Hansen

arXiv:2211.09913·cs.SD·November 21, 2022

TL;DR

This paper introduces three novel multi-source domain adaptation methods for forensic speaker recognition, addressing challenges of diverse acoustic environments and improving performance across multiple domains.

Contribution

It proposes domain adversarial training, discrepancy minimization, and moment-matching approaches for effective multi-domain adaptation in forensic speaker recognition.
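Of the three approaches named above, moment matching is the simplest to illustrate in isolation. The following is a minimal sketch, not the authors' formulation: it assumes each domain is a batch of fixed-length speaker embeddings, compares only the first and second raw moments, and uses hypothetical names (`mean_vector`, `second_moment`, `moment_matching_loss`) chosen for illustration. The idea is to penalize moment gaps both between each source domain and the target and between pairs of source domains.

```python
from itertools import combinations


def mean_vector(batch):
    """First moment: per-dimension mean of a batch of embeddings."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


def second_moment(batch):
    """Second raw moment: per-dimension mean of squared values."""
    n = len(batch)
    return [sum(x * x for x in col) / n for col in zip(*batch)]


def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def moment_matching_loss(sources, target):
    """Sum, over both moments, of the average source-target gap
    plus the average source-source gap (illustrative choice)."""
    loss = 0.0
    for moment in (mean_vector, second_moment):
        src = [moment(s) for s in sources]
        tgt = moment(target)
        # Gap between every source domain and the target domain.
        loss += sum(l2(m, tgt) for m in src) / len(src)
        # Gap between every pair of source domains.
        pairs = list(combinations(src, 2))
        if pairs:
            loss += sum(l2(a, b) for a, b in pairs) / len(pairs)
    return loss


# Toy 2-dimensional "embeddings" for two source domains and one target.
sources = [
    [[0.0, 0.0], [2.0, 2.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
target = [[1.0, 1.0]]
loss = moment_matching_loss(sources, target)
```

In a training setup, a term like this would typically be added to the speaker classification loss so that the embedding extractor is pushed toward features whose statistics agree across domains; the weighting between the two terms is a design choice not specified here.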

Findings

01

Diverse acoustic environments degrade recognition performance.

02

Domain adversarial training learns domain-invariant features.

03

Discrepancy minimization improves multi-domain performance.

Abstract

Adapting speaker recognition systems to new environments is a widely used technique for improving a well-performing model learned from large-scale data in task-specific, small-scale data scenarios. However, previous studies focus on single-domain adaptation, which neglects the more practical scenario in which training data are collected from multiple acoustic domains, as is needed in forensic scenarios. Audio analysis for forensic speaker recognition poses unique challenges in model training with multi-domain training data due to location/scenario uncertainty and diversity mismatch between reference and naturalistic field recordings. It is also difficult to directly employ small-scale domain-specific data to train complex neural network architectures due to domain mismatch and performance loss. Fine-tuning is a commonly used method for adaptation in order to retrain the model with weights…
