Temporal Coherence for Active Learning in Videos
Javad Zolfaghari Bengar, Abel Gonzalez-Garcia, Gabriel Villalonga, Bogdan Raducanu, Hamed H. Aghdam, Mikhail Mozerov, Antonio M. Lopez, Joost van de Weijer

TL;DR
This paper presents a novel active learning method for video object detection that leverages temporal coherence to efficiently identify uncertain detections, reducing annotation effort in autonomous driving data.
Contribution
It introduces a new active learning criterion based on a graph model of temporally linked detections and a synthetic dataset for evaluation.
Findings
Outperforms baseline active learning methods on two datasets.
Estimates false positives and false negatives by minimizing an energy function defined on a graph of temporally linked detections.
Provides a new synthetic dataset for active learning in road scene videos.
Abstract
Autonomous driving systems require huge amounts of data to train. Manual annotation of this data is time-consuming and prohibitively expensive since it involves human resources. Therefore, active learning emerged as an alternative to ease this effort and to make data annotation more manageable. In this paper, we introduce a novel active learning approach for object detection in videos by exploiting temporal coherence. Our active learning criterion is based on the estimated number of errors in terms of false positives and false negatives. The detections obtained by the object detector are used to define the nodes of a graph and tracked forward and backward to temporally link the nodes. Minimizing an energy function defined on this graphical model provides estimates of both false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL,…
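The idea of estimating detector errors from temporal coherence can be illustrated with a deliberately simplified sketch. It is not the paper's energy function: it assumes a single object track reduced to a chain of frames, a binary label per frame (object present or absent), and hypothetical cost parameters `miss_cost` and `smooth_cost`. The chain energy is minimized exactly with dynamic programming, and disagreements between the minimizing labels and the raw detections are counted as estimated false positives and false negatives.

```python
from typing import List, Optional, Tuple


def minimize_chain_energy(
    scores: List[Optional[float]],
    miss_cost: float = 0.6,    # assumed cost of labeling "object" where nothing was detected
    smooth_cost: float = 0.4,  # assumed cost of a label change between adjacent frames
) -> List[int]:
    """Return per-frame labels (1 = object, 0 = no object) minimizing a chain energy.

    scores[t] is the detector confidence at frame t along one track,
    or None when the detector produced no detection there.
    """
    # Unary costs: (cost of label 0, cost of label 1) for each frame.
    unary = []
    for s in scores:
        if s is None:
            unary.append((0.0, miss_cost))
        else:
            unary.append((s, 1.0 - s))  # confident detections make label 0 expensive

    # Forward pass (Viterbi): best cost of ending at each label in each frame.
    cost = [list(unary[0])]
    back = []
    for t in range(1, len(unary)):
        row, ptr = [], []
        for cur in (0, 1):
            cands = [
                cost[-1][prev] + (smooth_cost if prev != cur else 0.0)
                for prev in (0, 1)
            ]
            best = 0 if cands[0] <= cands[1] else 1
            row.append(cands[best] + unary[t][cur])
            ptr.append(best)
        cost.append(row)
        back.append(ptr)

    # Backward pass: recover the minimizing label sequence.
    label = 0 if cost[-1][0] <= cost[-1][1] else 1
    labels = [label]
    for ptr in reversed(back):
        label = ptr[label]
        labels.append(label)
    labels.reverse()
    return labels


def count_errors(scores: List[Optional[float]], labels: List[int]) -> Tuple[int, int]:
    """Count estimated (false positives, false negatives) for one track."""
    fp = sum(1 for s, l in zip(scores, labels) if s is not None and l == 0)
    fn = sum(1 for s, l in zip(scores, labels) if s is None and l == 1)
    return fp, fn
```

In this toy setting, a track with confident detections on both sides of a single missed frame yields one estimated false negative, while an isolated low-confidence detection yields one estimated false positive. Frames or videos with many such estimated errors would be natural candidates for annotation.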
