An Environmental Feature Representation in I-vector Space for Room Verification and Metadata Estimation

Desmond Caulley

arXiv:2203.04880·cs.SD·March 10, 2022·1 cites

TL;DR

This paper explores the use of environmental feature representations derived from i-vectors for room verification and acoustic metadata estimation, demonstrating their effectiveness and proposing methods to enhance verification accuracy.

Contribution

The paper applies e-vectors to room verification and proposes two methods for estimating SNR and reverberation time (T60) from these features.

Findings

1. E-vectors can be used for room verification with a low equal error rate.

2. The proposed methods estimate SNR and reverberation time (T60) from e-vectors.

3. Augmenting e-vectors with metadata improves room verification performance.

Abstract

This paper investigates the application of environmental feature representations to room verification and acoustic metadata estimation. Audio recordings contain both speaker and non-speaker information. We refer to representations of the non-speaker information, including channel and other environmental factors, as e-vectors. I-vectors, commonly used in speaker identification, are extracted in the total variability space and capture both speaker and channel-environment information without discrimination. Accordingly, e-vectors can be extracted from i-vectors using methods such as linear discriminant analysis. In this paper, we first demonstrate that e-vectors can be successfully applied to room verification tasks with a low equal error rate. Second, we propose two methods for estimating metadata, namely signal-to-noise ratio (SNR) and reverberation time (T60), from these e-vectors. When…
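The abstract reports room verification performance as an equal error rate (EER). The following is a minimal sketch of how such a figure can be computed from verification trials. It assumes cosine similarity as the trial score, which the abstract does not specify, and uses toy two-dimensional vectors as stand-ins for e-vectors. The function names `cosine_score` and `equal_error_rate` and all data values are hypothetical, chosen for illustration only.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping the threshold over all observed scores.

    A trial is accepted when score >= threshold. The result is the mean of the
    false-accept and false-reject rates at the threshold where the two rates
    are closest to each other.
    """
    best_gap, eer = float("inf"), None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy stand-ins for e-vectors from two rooms (illustrative values only).
room_a = [[1.0, 0.1], [0.9, 0.2]]
room_b = [[0.1, 1.0], [0.2, 0.9]]

# Target trials pair recordings from the same room; non-target trials pair
# recordings from different rooms.
target_scores = [cosine_score(room_a[0], room_a[1]),
                 cosine_score(room_b[0], room_b[1])]
nontarget_scores = [cosine_score(u, v) for u in room_a for v in room_b]

toy_eer = equal_error_rate(target_scores, nontarget_scores)
```

On this toy data the same-room scores all exceed the cross-room scores, so the two classes separate perfectly. Real trial lists overlap, and the EER then reflects the operating point where false accepts and false rejects occur at the same rate.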

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis