Graph attentive feature aggregation for text-independent speaker verification
Hye-jin Shim, Jungwoo Heo, Jae-han Park, Ga-hui Lee, Ha-Jin Yu

TL;DR
This paper introduces a graph attentive feature aggregation module that models pairwise relationships between frame-level features for improved text-independent speaker verification, demonstrating consistent performance gains.
Contribution
It proposes a novel graph attention-based module for aggregating frame-level features, directly modeling inter-feature relationships for speaker verification.
Findings
Achieves a relative improvement of more than 10% over the baseline systems.
Effectively models pairwise feature relationships using graph attention.
Demonstrates versatility across different architectures and input features, using SE-ResNet and RawNet2 as the underlying systems.
Abstract
The objective of this paper is to combine multiple frame-level features into a single utterance-level representation that accounts for pairwise relationships. For this purpose, we propose a novel graph attentive feature aggregation module by interpreting each frame-level feature as a node of a graph. The inter-relationship between all possible pairs of features, typically exploited indirectly, can be directly modeled using a graph. The module comprises a graph attention layer and a graph pooling layer followed by a readout operation. The graph attention layer first models the non-Euclidean data manifold between different nodes. Then, the graph pooling layer discards less informative nodes considering the significance of the nodes. Finally, the readout operation combines the remaining nodes into a single representation. We employ two recent systems, SE-ResNet and RawNet2, with different input…
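The three stages named in the abstract (graph attention, graph pooling, readout) can be illustrated with a small dependency-free sketch. This is not the authors' implementation. The scaled dot-product attention score, the sigmoid-gated top-k pooling driven by a projection vector `proj`, the mean readout, and every function and parameter name below are assumptions chosen for illustration; the paper's actual layers are learned and may differ in each of these choices.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def graph_attention(nodes):
    """Treat frames as nodes of a fully connected graph.

    Assumption: scores are scaled dot products with no learned weights.
    Each output node is an attention-weighted sum over all nodes.
    """
    scale = math.sqrt(len(nodes[0]))
    out = []
    for h_i in nodes:
        weights = softmax([dot(h_i, h_j) / scale for h_j in nodes])
        out.append([sum(w * h_j[d] for w, h_j in zip(weights, nodes))
                    for d in range(len(h_i))])
    return out


def topk_pool(nodes, proj, keep_ratio=0.5):
    """Discard lower-scoring nodes (assumed top-k pooling).

    `proj` stands in for a learned projection vector; here it is random.
    """
    norm = math.sqrt(dot(proj, proj))
    scores = [1.0 / (1.0 + math.exp(-dot(h, proj) / norm)) for h in nodes]
    k = max(1, int(len(nodes) * keep_ratio))
    kept = sorted(range(len(nodes)), key=lambda i: scores[i], reverse=True)[:k]
    # Gate each surviving node by its significance score.
    return [[scores[i] * x for x in nodes[i]] for i in kept]


def readout(nodes):
    """Combine the remaining nodes into one vector (assumed: mean)."""
    dim = len(nodes[0])
    return [sum(h[d] for h in nodes) / len(nodes) for d in range(dim)]


def aggregate(frames, proj, keep_ratio=0.5):
    return readout(topk_pool(graph_attention(frames), proj, keep_ratio))


random.seed(0)
# 20 synthetic frame-level features of dimension 8 (placeholder data).
frames = [[random.gauss(0, 1) for _ in range(8)] for _ in range(20)]
proj = [random.gauss(0, 1) for _ in range(8)]
embedding = aggregate(frames, proj)
print(len(embedding))
```

With 20 input frames and `keep_ratio=0.5`, pooling keeps 10 nodes, and the readout returns a single vector with the same dimension as one frame-level feature.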
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
