Truncate-Split-Contrast: A Framework for Learning from Mislabeled Videos
Zixiao Wang, Junwu Weng, Chun Yuan, and Jue Wang

TL;DR
This paper introduces NEAT, a framework for learning from noisy labels in videos. It uses channel truncation to detect label noise and noise contrastive learning to regularize training, improving classification accuracy by over 5% on Mini-Kinetics.
Contribution
The paper proposes two novel strategies tailored to learning from noisy video labels: channel truncation for feature selection and noise contrastive learning for regularization.
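To make the channel truncation idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that "most discriminative" can be approximated by the highest mean activation within a category, and that an instance is flagged as noisy when little of its feature energy falls in the selected channels. The helper names (`select_channels`, `retained_energy`, `split_clean_noisy`), the fixed threshold, and the toy features are all hypothetical.

```python
import math

def select_channels(features, k):
    """Return the indices of the k channels with the largest mean activation.

    Assumption: mean activation within the category stands in for
    "discriminative"; the paper's actual criterion may differ.
    """
    n = len(features)
    num_channels = len(features[0])
    means = [sum(f[c] for f in features) / n for c in range(num_channels)]
    ranked = sorted(range(num_channels), key=lambda c: means[c], reverse=True)
    return ranked[:k]

def retained_energy(feature, keep):
    """Fraction of the feature's L2 norm that survives truncation to `keep`."""
    total = math.sqrt(sum(x * x for x in feature))
    kept = math.sqrt(sum(feature[c] ** 2 for c in keep))
    return kept / total if total > 0 else 0.0

def split_clean_noisy(features, k, threshold=0.5):
    """Split one category's instances into (clean, noisy) index lists.

    Assumption: a fixed threshold on retained energy is used here purely
    for illustration.
    """
    keep = select_channels(features, k)
    scores = [retained_energy(f, keep) for f in features]
    clean = [i for i, s in enumerate(scores) if s >= threshold]
    noisy = [i for i, s in enumerate(scores) if s < threshold]
    return clean, noisy

# Toy features for instances that all carry the same (possibly wrong) label.
features = [
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.1, 0.2, 1.0],  # activates a different channel: likely mislabeled
]
clean, noisy = split_clean_noisy(features, k=2)
```

In this toy case the first three instances concentrate their activation in channels 0 and 1, so they end up in `clean`, while the fourth lands in `noisy`. Working on a handful of channels rather than the full feature vector is what keeps the detection step lightweight.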
Findings
Achieves a noise detection F1-score above 0.4 under severe noise.
Improves classification accuracy by over 5% on Mini-Kinetics.
Enhances average accuracy by over 1.6% with noise contrastive learning.
Abstract
Learning with noisy labels (LNL) is a classic problem that has been extensively studied for image tasks, but much less for video in the literature. A straightforward migration from images to videos without considering the properties of videos, such as computational cost and redundant information, is not a sound choice. In this paper, we propose two new strategies for video analysis with noisy labels: 1) A lightweight channel selection method dubbed Channel Truncation for feature-based label noise detection. This method selects the most discriminative channels to split clean and noisy instances in each category; 2) A novel contrastive strategy dubbed Noise Contrastive Learning, which constructs the relationship between clean and noisy instances to regularize model training. Experiments on three well-known benchmark datasets for video classification show that our proposed tru…
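One way to picture a contrastive relationship between clean and noisy instances is an InfoNCE-style loss. The sketch below is an illustration under stated assumptions, not the loss defined in the paper. For a clean anchor, it treats other clean instances of the same category as positives and instances flagged as noisy in that category as negatives. The function name `noise_contrastive_loss`, the cosine similarity, and the temperature value are hypothetical choices.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def noise_contrastive_loss(anchor, clean_peers, noisy_peers, temperature=0.1):
    """InfoNCE-style loss for one clean anchor embedding.

    Assumption: clean peers act as positives and noisy peers as negatives.
    Requires at least one clean peer so the ratio is well defined.
    """
    pos = sum(math.exp(cosine(anchor, p) / temperature) for p in clean_peers)
    neg = sum(math.exp(cosine(anchor, q) / temperature) for q in noisy_peers)
    return -math.log(pos / (pos + neg))

anchor = [1.0, 0.0]
# Anchor sits close to its clean peer and far from the noisy one.
separated = noise_contrastive_loss(anchor, [[0.9, 0.1]], [[0.0, 1.0]])
# Roles swapped: anchor sits close to the noisy instance instead.
confused = noise_contrastive_loss(anchor, [[0.0, 1.0]], [[0.9, 0.1]])
```

The loss is small when the anchor is near its clean peers and far from the noisy ones (`separated`), and large in the opposite case (`confused`), so minimizing it pushes the two groups apart in embedding space.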
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
Methods: Contrastive Learning
