dopanim: A Dataset of Doppelganger Animals with Noisy Annotations from Multiple Humans

Marek Herde; Denis Huseljic; Lukas Rauch; Bernhard Sick

arXiv:2407.20950 · cs.CV · July 31, 2024

TL;DR

The paper introduces dopanim, a benchmark dataset of about 15,750 animal images across 15 classes, with noisy annotations from multiple humans, designed to evaluate machine learning methods that must cope with noisy labels.

Contribution

It provides a novel benchmark dataset with multiple annotations per image, human-estimated likelihoods, and metadata, enabling empirical evaluation of noisy label learning methods.

Findings

01 Benchmark results for multi-annotator learning approaches

02 Demonstration of learning beyond hard class labels

03 Evaluation of active learning strategies

Abstract

Human annotators typically provide annotated data for training machine learning models, such as neural networks. Yet, human annotations are subject to noise, impairing generalization performances. Methodological research on approaches counteracting noisy annotations requires corresponding datasets for a meaningful empirical evaluation. Consequently, we introduce a novel benchmark dataset, dopanim, consisting of about 15,750 animal images of 15 classes with ground truth labels. For approximately 10,500 of these images, 20 humans provided over 52,000 annotations with an accuracy of circa 67%. Its key attributes include (1) the challenging task of classifying doppelganger animals, (2) human-estimated likelihoods as annotations, and (3) annotator metadata. We benchmark well-known multi-annotator learning approaches using seven variants of this dataset and outline further evaluation use…
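To make the annotation setting concrete: with over 52,000 annotations spread across approximately 10,500 images, each annotated image carries roughly five annotations on average, and they may disagree. The sketch below shows two simple baselines for collapsing such annotations: majority voting over hard class labels, and averaging likelihood vectors into a soft label. It is an illustrative sketch only, not the aggregation used in the paper or its repository. The function names, the use of `None` for a missing annotation, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(annotations):
    """Return the most frequent class label among one image's annotations.

    `annotations` holds one entry per annotator; None marks an annotator
    who did not label this image (an assumed encoding for this sketch).
    Returns None when nobody annotated the image.
    """
    votes = [a for a in annotations if a is not None]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]


def mean_likelihoods(likelihoods):
    """Average per-annotator likelihood vectors into one soft label."""
    n_annotators = len(likelihoods)
    n_classes = len(likelihoods[0])
    return [
        sum(vec[c] for vec in likelihoods) / n_annotators
        for c in range(n_classes)
    ]


# Toy data, made up for illustration (not taken from dopanim):
# 3 images, 4 annotators, class names chosen arbitrarily.
toy_annotations = [
    ["jaguar", "leopard", "jaguar", None],
    [None, "cheetah", "cheetah", "leopard"],
    [None, None, None, None],
]
hard_labels = [majority_vote(row) for row in toy_annotations]

# Two annotators express their belief over three classes for one image.
toy_likelihoods = [[0.5, 0.5, 0.0], [1.0, 0.0, 0.0]]
soft_label = mean_likelihoods(toy_likelihoods)
```

Majority voting treats every annotator as equally reliable, which is the assumption that multi-annotator learning approaches typically relax by estimating each annotator's performance.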

Code & Models

Repositories

ies-research/multi-annotator-machine-learning
PyTorch · Official

Videos

dopanim: A Dataset of Doppelganger Animals with Noisy Annotations from Multiple Humans · SlidesLive

Taxonomy

Topics: Human Pose and Action Recognition