Enhancing Semi-Supervised Learning via Representative and Diverse Sample Selection

Qian Shao; Jiangrui Kang; Qiyuan Chen; Zepeng Li; Hongxia Xu; Yiwen Cao; Jiajuan Liang; Jian Wu

arXiv:2409.11653·cs.LG·October 29, 2024

TL;DR

This paper introduces RDSS, a novel sample selection method for semi-supervised learning that improves performance under low annotation budgets by selecting representative and diverse samples using a modified Frank-Wolfe algorithm.

Contribution

It proposes a new sample selection approach, RDSS, based on minimizing $\alpha$-MMD, to enhance SSL performance in low-budget settings.

Findings

01

RDSS outperforms existing sample selection methods in SSL.

02

RDSS improves generalization in low-budget SSL scenarios.

03

Experimental results validate the effectiveness of RDSS across multiple frameworks.

Abstract

Semi-Supervised Learning (SSL) has become a preferred paradigm in many deep learning tasks, which reduces the need for human labor. Previous studies primarily focus on effectively utilising the labelled and unlabeled data to improve performance. However, we observe that how to select samples for labelling also significantly impacts performance, particularly under extremely low-budget settings. The sample selection task in SSL has been under-explored for a long time. To fill in this gap, we propose a Representative and Diverse Sample Selection approach (RDSS). By adopting a modified Frank-Wolfe algorithm to minimise a novel criterion $\alpha$-Maximum Mean Discrepancy ($\alpha$-MMD), RDSS samples a representative and diverse subset for annotation from the unlabeled data. We demonstrate that minimizing $\alpha$-MMD enhances the generalization ability of low-budget learning. Experimental…
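To make the selection idea concrete, here is a minimal sketch of a greedy, kernel-herding-style rule, which is the usual form a Frank-Wolfe step takes when minimizing an MMD-type criterion over a finite pool. It is not the paper's algorithm: the function names `rbf_kernel` and `greedy_select`, the RBF kernel, the `bandwidth` parameter, and the exact place where `alpha` enters the score are all assumptions made for illustration. The first term of the score rewards representativeness (similarity to the whole pool) and the second penalizes redundancy with already selected samples (diversity).

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))


def greedy_select(features, budget, alpha=1.0, bandwidth=1.0):
    """Pick `budget` row indices of `features` to send for annotation.

    Hypothetical scoring rule (an assumption, not taken from the paper):
        score(x) = mean_j k(x, x_j) - alpha * mean_{s in selected} k(x, s)
    """
    n = features.shape[0]
    kernel = rbf_kernel(features, features, bandwidth)
    representativeness = kernel.mean(axis=1)  # similarity to the whole pool
    similarity_to_selected = np.zeros(n)      # running sum of k(x, s)
    available = np.ones(n, dtype=bool)
    selected = []
    for step in range(budget):
        score = representativeness - alpha * similarity_to_selected / (step + 1)
        score = np.where(available, score, -np.inf)  # never pick a sample twice
        idx = int(np.argmax(score))
        selected.append(idx)
        available[idx] = False
        similarity_to_selected += kernel[:, idx]
    return selected


# Usage: two well-separated clusters of five points each.
offsets = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.1]])
pool = np.vstack([offsets, offsets + 10.0])
picked = greedy_select(pool, budget=2)
```

With a budget of two, the redundancy term steers the second pick away from the cluster of the first, so each cluster contributes one sample. Setting `alpha=0` in this sketch removes the diversity pressure and ranks samples by pool similarity alone.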

Code & Models

Repositories

yanhuiailab/rdss
PyTorch · Official

Videos

Enhancing Semi-Supervised Learning via Representative and Diverse Sample Selection · SlidesLive

Taxonomy

Topics: Face and Expression Recognition

Methods: Focus