CrowdTeacher: Robust Co-teaching with Noisy Answers & Sample-specific Perturbations for Tabular Data
Mani Sotoodeh, Li Xiong, Joyce C. Ho

TL;DR
CrowdTeacher introduces a robust co-teaching approach for tabular data that leverages input perturbations based on crowdsourced annotation certainty to improve learning from noisy labels.
Contribution
It extends co-teaching methods with input perturbations tailored for crowdsourcing noise, enhancing robustness and accuracy in tabular data scenarios.
Findings
Outperforms baseline methods on synthetic and real datasets.
Improves predictive accuracy in noisy label conditions.
Effective across various label density settings.
Abstract
In many domains, samples with ground-truth labels are not always available. While learning from crowdsourced labels has been explored, existing models can still fail in the presence of sparse, unreliable, or diverging annotations. Co-teaching methods have shown promising improvements for computer vision problems with noisy labels by employing two classifiers, each trained on the other's confident samples in each batch. Inspired by the idea of separating confident and uncertain samples during the training process, we extend it to the crowdsourcing problem. Our model, CrowdTeacher, uses the idea that perturbation in the input space can improve the robustness of the classifier to noisy labels. Treating crowdsourcing annotations as a source of noisy labeling, we perturb samples based on the certainty from the aggregated annotations. The perturbed samples are fed to a Co-teaching…
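The two ideas in the abstract, certainty-driven input perturbation and exchanging confident samples between two classifiers, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: majority voting as the aggregation rule, agreement fraction as the certainty score, Gaussian noise whose scale grows as certainty falls, and the small-loss selection rule from standard co-teaching. The function names `aggregate_votes`, `perturb_by_certainty`, and `co_teaching_select` are hypothetical.

```python
import numpy as np


def aggregate_votes(votes):
    """Majority-vote label and agreement-based certainty per sample.

    votes: (n_samples, n_annotators) int array with entries in {0, 1};
    -1 marks a missing annotation. Majority voting is an assumed
    aggregation rule, not necessarily the one used in the paper.
    """
    observed = votes >= 0
    n_obs = observed.sum(axis=1)
    pos = np.where(observed, votes, 0).sum(axis=1)
    pos_frac = pos / np.maximum(n_obs, 1)  # guard against division by zero
    labels = (pos_frac >= 0.5).astype(int)
    # Fraction of annotators agreeing with the majority, in [0.5, 1].
    certainty = np.maximum(pos_frac, 1.0 - pos_frac)
    # Samples without any annotation get the lowest certainty.
    certainty = np.where(n_obs > 0, certainty, 0.5)
    return labels, certainty


def perturb_by_certainty(X, certainty, scale=0.1, rng=None):
    """Add per-sample Gaussian noise that shrinks as certainty grows.

    Assumption: less certain samples receive larger perturbations.
    Noise is scaled by each feature's standard deviation.
    """
    rng = np.random.default_rng() if rng is None else rng
    feature_std = X.std(axis=0, keepdims=True)
    sigma = scale * (1.0 - certainty)[:, None] * feature_std
    return X + rng.normal(size=X.shape) * sigma


def co_teaching_select(loss_a, loss_b, keep_ratio):
    """Small-loss exchange: each model picks training samples for its peer.

    Returns (idx_for_a, idx_for_b): the indices model A should train on
    (chosen by B's smallest losses) and vice versa.
    """
    n_keep = max(1, int(round(keep_ratio * len(loss_a))))
    idx_for_b = np.argsort(loss_a)[:n_keep]  # A's confident samples teach B
    idx_for_a = np.argsort(loss_b)[:n_keep]  # B's confident samples teach A
    return idx_for_a, idx_for_b


if __name__ == "__main__":
    votes = np.array([[1, 1, 1], [1, 0, -1], [0, 0, 1]])
    labels, certainty = aggregate_votes(votes)
    X = np.random.default_rng(0).normal(size=(3, 4))
    X_perturbed = perturb_by_certainty(X, certainty, rng=np.random.default_rng(1))
    print(labels, certainty, X_perturbed.shape)
```

In a full training loop, the per-sample losses passed to `co_teaching_select` would come from two separately initialized classifiers evaluated on the same perturbed batch, and `keep_ratio` would typically be lowered over the first epochs as the models become more reliable.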
