Issues in Object Detection in Videos using Common Single-Image CNNs
Spencer Ploeger, Lucas Dasovic

TL;DR
This paper addresses the challenges of object detection in videos by proposing a dataset generation method using FlowNet2-Pytorch and a novel Magnitude Method, aiming to improve neural network consistency across frames.
Contribution
It introduces a new dataset creation technique and a loss function for training object detection models to better handle video data.
Findings
The Magnitude Method effectively generates ground-truth flow masks.
The proposed loss function improves consistency in object detection across frames.
The system was tested successfully on multiple video samples.
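The summary does not spell out how the Magnitude Method turns optical flow into a mask, so the following is only a minimal sketch under an assumed reading: a per-pixel flow field (such as one produced by FlowNet2-Pytorch) is reduced to its magnitude and thresholded to obtain a binary motion mask. The function name `flow_magnitude_mask`, the `(H, W, 2)` array layout, and the `threshold` parameter are hypothetical and not taken from the paper.

```python
import numpy as np


def flow_magnitude_mask(flow, threshold=1.0):
    """Return a boolean mask of pixels whose flow magnitude exceeds threshold.

    Assumption (not from the paper): `flow` is an (H, W, 2) array holding the
    per-pixel (dx, dy) displacement between two consecutive frames.
    """
    flow = np.asarray(flow, dtype=np.float64)
    if flow.ndim != 3 or flow.shape[2] != 2:
        raise ValueError("flow must have shape (H, W, 2)")
    # Euclidean length of each displacement vector.
    magnitude = np.hypot(flow[..., 0], flow[..., 1])
    return magnitude > threshold


# Toy example: a 4x4 field where only the central 2x2 patch moves.
flow = np.zeros((4, 4, 2))
flow[1:3, 1:3] = (3.0, 4.0)  # displacement of length 5 pixels
mask = flow_magnitude_mask(flow, threshold=1.0)
```

In this toy field only the four central pixels are marked as moving. A real pipeline would likely need a threshold tuned to the video's resolution and frame rate, since camera motion also produces non-zero flow.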
Abstract
A growing branch of computer vision is object detection. Object detection is used in many applications such as industrial processes, medical imaging analysis, and autonomous vehicles. The ability to detect objects in videos is crucial. Object detection systems are trained on large image datasets. For applications such as autonomous vehicles, it is crucial that the object detection system can identify objects through multiple frames in video. There are many problems with applying these systems to video. Shadows or changes in brightness can cause the system to incorrectly identify objects from frame to frame and trigger an unintended system response. Many neural networks have been used for object detection, and if there were a way of connecting objects between frames, these problems could be eliminated. For these neural networks to get better at identifying objects in video,…
Taxonomy
Topics: Advanced Neural Network Applications · Image Enhancement Techniques · Autonomous Vehicle Technology and Safety
Methods: Region Proposal Network · Convolution · Softmax · RoIAlign · Mask R-CNN
