Increasing Data Diversity with Iterative Sampling to Improve Performance

Devrim Cavusoglu; Ogulcan Eryuksel; Sinan Altinuc

arXiv:2111.03743 · cs.LG · November 9, 2021 · 1 cites

TL;DR

This paper presents an iterative sampling method to enhance data diversity in training datasets, aiming to improve model performance by focusing on difficult and edge-case samples through diverse augmentation techniques.

Contribution

It introduces a data-centric approach that leverages iterative sampling and augmentation diversity to boost training data quality and model accuracy.

Findings

01. Improved model performance with increased data diversity.

02. Enhanced focus on difficult and edge-case classes.

03. Effective use of diverse augmentation methods.

Abstract

As a part of the Data-Centric AI Competition, we propose a data-centric approach to improve the diversity of the training samples by iterative sampling. The method itself relies strongly on the fidelity of augmented samples and the diversity of the augmentation methods. Moreover, we further improve performance by introducing more samples for the difficult classes, especially samples close to edge cases, potentially those that the model at hand misclassifies.
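
The abstract does not spell out the procedure, so the following is only a minimal sketch of what one reading of "iterative sampling" with extra samples for difficult classes could look like. Every name here (`per_class_error`, `allocate_budget`, `iterative_sampling`, `fit_fn`, `augment_fn`) is hypothetical, and the choice to split the augmentation budget in proportion to per-class error is an assumption, not something the paper states.

```python
import random
from collections import Counter


def per_class_error(labels, predictions):
    """Fraction of misclassified samples for each class present in labels."""
    totals = Counter(labels)
    wrong = Counter(y for y, p in zip(labels, predictions) if y != p)
    return {c: wrong[c] / totals[c] for c in totals}


def allocate_budget(errors, budget):
    """Split an augmentation budget across classes in proportion to error rate.

    Assumption: proportional allocation; the paper may use a different rule.
    """
    total = sum(errors.values())
    if total == 0:
        # No errors at all: spread the budget evenly (an arbitrary fallback).
        share = budget // len(errors)
        return {c: share for c in errors}
    return {c: int(budget * e / total) for c, e in errors.items()}


def iterative_sampling(train, fit_fn, augment_fn, rounds, budget, seed=0):
    """Grow a training set over several rounds.

    train      : list of (x, y) pairs
    fit_fn     : hypothetical callable, fit_fn(train) -> predict(x)
    augment_fn : hypothetical callable, augment_fn(x, rng) -> new x
    """
    rng = random.Random(seed)
    train = list(train)
    for _ in range(rounds):
        predict = fit_fn(train)
        labels = [y for _, y in train]
        preds = [predict(x) for x, _ in train]
        quota = allocate_budget(per_class_error(labels, preds), budget)
        for cls, n_new in quota.items():
            # Prefer samples of this class that the current model gets wrong;
            # fall back to any sample of the class if none are misclassified.
            pool = [x for (x, y), p in zip(train, preds) if y == cls and p != y]
            if not pool:
                pool = [x for x, y in train if y == cls]
            for _ in range(n_new):
                train.append((augment_fn(rng.choice(pool), rng), cls))
    return train
```

In this sketch, `fit_fn` stands in for retraining whatever model is at hand, and `augment_fn` stands in for a label-preserving augmentation. The abstract's point about fidelity matters here: if `augment_fn` distorts a sample enough to change its true class, the loop would add mislabeled data rather than useful edge cases.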

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Domain Adaptation and Few-Shot Learning