Semi Supervised Learning For Few-shot Audio Classification By Episodic Triplet Mining
Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu

TL;DR
This paper introduces Episodic Triplet Mining (ETM), a semi-supervised approach to few-shot audio classification that replaces the prototypical loss with episodic triplet mining and leverages unlabeled data, outperforming prototypical networks.
Contribution
The paper proposes ETM, which combines episodic mining of semi-hard triplets with the use of unlabeled data to improve few-shot audio classification performance.
Findings
ETM outperforms prototypical networks in few-shot audio tasks.
Using unlabeled data further boosts classification accuracy.
Episodic training reduces overfitting in triplet selection.
Abstract
Few-shot learning aims to generalize to unseen classes that appear during testing but are unavailable during training. Prototypical networks incorporate few-shot metric learning by constructing a class prototype in the form of a mean vector of the embedded support points within a class. The performance of prototypical networks in extreme few-shot scenarios (like one-shot) degrades drastically, mainly because variations within the clusters are neglected while constructing prototypes. In this paper, we propose to replace the typical prototypical loss function with an Episodic Triplet Mining (ETM) technique. Conventional triplet selection leads to overfitting because all possible combinations are used during training. We incorporate episodic training for mining the semi-hard positive and the semi-hard negative triplets to overcome the overfitting. We also propose an adaptation to…
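The two ingredients named in the abstract, mean-vector prototypes and semi-hard triplet selection within an episode, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, Euclidean distance and the margin value are assumptions, and the semi-hard rule used here is the common one for negatives (farther from the anchor than the positive, but by less than the margin). The paper's exact definitions of semi-hard positives and negatives are not given in this excerpt.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Prototype per class = mean of that class's support embeddings."""
    grouped = defaultdict(list)
    for vec, lab in zip(embeddings, labels):
        grouped[lab].append(vec)
    return {
        lab: [sum(col) / len(vecs) for col in zip(*vecs)]
        for lab, vecs in grouped.items()
    }


def semi_hard_triplets(embeddings, labels, margin=0.5):
    """Mine (anchor, positive, negative) index triplets inside one episode.

    Assumed rule (not taken from the paper): a negative is semi-hard when
    d(a, p) < d(a, n) < d(a, p) + margin.
    """
    n = len(labels)
    triplets = []
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # positive must be a different sample of the same class
            d_ap = math.dist(embeddings[a], embeddings[p])
            for neg in range(n):
                if labels[neg] == labels[a]:
                    continue  # negative must come from another class
                d_an = math.dist(embeddings[a], embeddings[neg])
                if d_ap < d_an < d_ap + margin:
                    triplets.append((a, p, neg))
    return triplets


def triplet_loss(embeddings, triplets, margin=0.5):
    """Mean hinge loss max(d(a,p) - d(a,n) + margin, 0) over mined triplets."""
    if not triplets:
        return 0.0
    total = 0.0
    for a, p, neg in triplets:
        d_ap = math.dist(embeddings[a], embeddings[p])
        d_an = math.dist(embeddings[a], embeddings[neg])
        total += max(d_ap - d_an + margin, 0.0)
    return total / len(triplets)
```

Restricting the mining to the samples of a single episode, rather than enumerating triplets over the whole training set, is what keeps the number of combinations small in this sketch. In practice the embeddings would come from a trained audio encoder rather than hand-written lists.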
Taxonomy
Topics: Music and Audio Processing · Speech Recognition and Synthesis · Domain Adaptation and Few-Shot Learning
