Out-Of-Distribution Detection for Audio-visual Generalized Zero-Shot Learning: A General Framework

Liuyuan Wen

arXiv:2408.01284 · cs.MM · September 15, 2025

TL;DR

This paper proposes a unified framework for audio-visual generalized zero-shot learning that integrates embedding and generative methods using out-of-distribution detection to improve classification of seen and unseen classes.

Contribution

It introduces a novel framework combining generative adversarial networks and OOD detection for audio-visual GZSL, addressing limitations of existing methods.

Findings

1. Significant performance improvements over state-of-the-art methods.

2. Effective synthesis of unseen features using GANs.

3. Robust OOD detection for better class differentiation.

Abstract

Generalized Zero-Shot Learning (GZSL) is a challenging task requiring accurate classification of both seen and unseen classes. Within this domain, audio-visual GZSL emerges as an extremely exciting yet difficult task, given the inclusion of both visual and acoustic features as multi-modal inputs. Existing efforts in this field mostly use either embedding-based or generative-based methods. However, generative training is difficult and unstable, while embedding-based methods often encounter the domain shift problem. Thus, we find it promising to integrate both methods into a unified framework to leverage their advantages while mitigating their respective disadvantages. Our study introduces a general framework employing out-of-distribution (OOD) detection, aiming to harness the strengths of both approaches. We first employ generative adversarial networks to synthesize unseen features,…
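The abstract is truncated here, so the exact pipeline is not shown on this page. As a rough illustration of how an OOD detector can sit between two classification branches, the sketch below routes each sample to a seen-class or an unseen-class classifier depending on an OOD score. Everything in it is an assumption made for illustration: the function name `route_by_ood`, the threshold, the score convention (higher means more likely unseen), and the toy stand-in components are hypothetical and are not taken from the paper or its repository.

```python
from typing import Callable, Sequence

Feature = Sequence[float]


def route_by_ood(
    feature: Feature,
    ood_score: Callable[[Feature], float],
    seen_classifier: Callable[[Feature], str],
    unseen_classifier: Callable[[Feature], str],
    threshold: float = 0.5,
) -> str:
    """Route a sample to one of two classifiers based on an OOD score.

    Assumption (not from the paper): a higher score means the sample is
    more likely to belong to an unseen class.
    """
    if ood_score(feature) > threshold:
        return unseen_classifier(feature)
    return seen_classifier(feature)


# Hypothetical toy stand-ins; real components would be learned models
# operating on fused audio-visual features.
def toy_ood_score(feature: Feature) -> float:
    # Mean absolute value, used only to make the routing visible.
    return sum(abs(x) for x in feature) / len(feature)


def toy_seen_classifier(feature: Feature) -> str:
    return "seen:dog_barking"


def toy_unseen_classifier(feature: Feature) -> str:
    return "unseen:violin"


if __name__ == "__main__":
    for sample in ([0.1, 0.2], [0.9, 1.1]):
        label = route_by_ood(
            sample, toy_ood_score, toy_seen_classifier, toy_unseen_classifier
        )
        print(sample, "->", label)
```

In a design like this, each branch only has to separate classes within its own domain, which is one way a framework could limit the bias toward seen classes associated with domain shift. Whether the paper's framework gates in exactly this manner cannot be confirmed from the truncated abstract.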

Code & Models

Repositories

liuyuan-wen/av-ood-gzsl
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Geophysical Methods and Applications