Modification method for single-stage object detectors that allows to exploit the temporal behaviour of a scene to improve detection accuracy
Menua Gevorgyan

TL;DR
This paper proposes a simple modification to single-stage object detectors like YOLO and SSD that leverages temporal scene information to significantly enhance detection accuracy, especially for occluded and hidden objects in videos.
Contribution
A novel modification method for single-stage detectors that exploits temporal information to improve accuracy without extra annotated data.
Findings
Improved detection accuracy on video data.
Enhanced detection confidence for occluded and hidden objects.
Effective weakly supervised training approach.
Abstract
A simple modification method for single-stage generic object detection neural networks, such as YOLO and SSD, is proposed. It improves detection accuracy on video data by exploiting the temporal behavior of the scene in the detection pipeline. It is shown that, using this method, the detection accuracy of the base network can be considerably improved, especially for occluded and hidden objects. It is also shown that a modified network tends to detect hidden objects with higher confidence than an unmodified one. A weakly supervised training method is proposed, which allows a modified network to be trained without any additional annotated data.
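The abstract does not spell out how temporal information enters the detection pipeline. As a purely illustrative sketch, and not the author's confirmed design, one plausible way to let a single-stage detector see the previous frame is to concatenate the current and previous feature maps along the channel axis and mix them with a 1x1 convolution. The function names, tensor shapes, and the identity initialisation below are all assumptions made for this example.

```python
import numpy as np


def fuse_temporal_features(current, previous, weight, bias):
    """Fuse two (C, H, W) feature maps with a 1x1 convolution.

    Hypothetical helper: `weight` has shape (C_out, 2 * C) and `bias`
    has shape (C_out,). A 1x1 convolution is a per-pixel linear map
    over channels, so it can be written as a single einsum.
    """
    stacked = np.concatenate([current, previous], axis=0)  # (2C, H, W)
    fused = np.einsum("oc,chw->ohw", weight, stacked)      # (C_out, H, W)
    return fused + bias[:, None, None]


def identity_init(channels):
    """Assumed initialisation: pass the current frame through unchanged
    and ignore the previous frame, so the fused output initially equals
    the base detector's own features."""
    weight = np.concatenate(
        [np.eye(channels), np.zeros((channels, channels))], axis=1
    )
    bias = np.zeros(channels)
    return weight, bias


# Toy feature maps standing in for a detector backbone's output.
rng = np.random.default_rng(0)
current = rng.normal(size=(4, 8, 8))
previous = rng.normal(size=(4, 8, 8))

weight, bias = identity_init(4)
fused = fuse_temporal_features(current, previous, weight, bias)
```

With the identity initialisation, `fused` matches `current` exactly, so in this sketch the modified layer starts out reproducing the base network's features and only departs from them once the weights on the previous-frame channels are trained.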
Taxonomy
Topics: Advanced Neural Network Applications · Infrared Target Detection Methodologies · CCD and CMOS Imaging Sensors
Methods: You Only Look Once · Convolution · 1x1 Convolution · Non Maximum Suppression · SSD
