Learnable Spectro-temporal Receptive Fields for Robust Voice Type Discrimination

Tyler Vuong; Yangyang Xia; Richard Stern

arXiv:2010.09151 · eess.AS · November 12, 2020 · Interspeech

TL;DR

This paper proposes a deep-learning voice type discrimination (VTD) system whose first layer consists of learnable spectro-temporal receptive fields (STRFs). The system is evaluated on a new standardized VTD database and also performs strongly on the related ASVspoof 2019 spoofing detection task.

Contribution

The paper proposes a deep-learning VTD system with an initial layer of learnable STRFs and compares it against static STRFs and unconstrained kernels.

Findings

01

Learnable STRFs outperform static STRFs in VTD.

02

The system improves on the baseline across a range of signal-to-noise ratios (SNRs).

03

Effective in spoofing detection with distractor noise.

Abstract

Voice Type Discrimination (VTD) refers to discrimination between regions in a recording where speech was produced by speakers that are physically within proximity of the recording device ("Live Speech") from speech and other types of audio that were played back such as traffic noise and television broadcasts ("Distractor Audio"). In this work, we propose a deep-learning-based VTD system that features an initial layer of learnable spectro-temporal receptive fields (STRFs). Our approach is also shown to provide very strong performance on a similar spoofing detection task in the ASVspoof 2019 challenge. We evaluate our approach on a new standardized VTD database that was collected to support research in this area. In particular, we study the effect of using learnable STRFs compared to static STRFs or unconstrained kernels. We also show that our system consistently improves a competitive…
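
To make the contrast between learnable STRFs and unconstrained kernels concrete, here is a minimal sketch in plain Python. It assumes a Gabor-style parameterization (a 2D cosine carrier under a Hann envelope) driven by two parameters, a temporal modulation `rate` and a spectral modulation `scale`. The abstract does not state the paper's exact parameterization, so the function `gabor_strf`, its argument names, the units, and the 9x9 kernel size are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def gabor_strf(rate, scale, n_time=9, n_freq=9):
    """Build an n_freq x n_time kernel from two modulation parameters.

    rate  -- temporal modulation in cycles per frame (assumed unit)
    scale -- spectral modulation in cycles per channel (assumed unit)
    This is an illustrative Gabor-style kernel, not the paper's exact form.
    """
    def hann(i, n):
        # Symmetric Hann window that is non-zero at every sample.
        return 0.5 - 0.5 * math.cos(2 * math.pi * (i + 1) / (n + 1))

    kernel = []
    for f in range(n_freq):
        fc = f - (n_freq - 1) / 2  # frequency offset from kernel center
        row = []
        for t in range(n_time):
            tc = t - (n_time - 1) / 2  # time offset from kernel center
            carrier = math.cos(2 * math.pi * (rate * tc + scale * fc))
            row.append(hann(t, n_time) * hann(f, n_freq) * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_strf(rate=0.1, scale=0.05)

# Free parameters per kernel under each choice (for this sketch only).
n_params_strf = 2                              # rate and scale
n_params_free = len(kernel) * len(kernel[0])   # every tap is free
```

In this sketch a constrained kernel is fully determined by two numbers, whereas an unconstrained kernel of the same size has one free value per tap. In a framework such as PyTorch, `rate` and `scale` would be trainable parameters updated by backpropagation; a static STRF would keep them fixed.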

Code & Models

Repositories

raymondxyy/strfnet-IS2020
PyTorch · Official