Object Tracking and Geo-localization from Street Images
Daniel Wilson, Thayer Alshaabi, Colin Van Oort, Xiaohan Zhang, Jonathan Nelson, Safwan Wshah

TL;DR
This paper introduces a two-stage system for detecting and geolocalizing traffic signs from street videos, combining a modified RetinaNet with a custom tracker to improve road asset mapping and autonomous driving applications.
Contribution
The paper presents GPS-RetinaNet, a modified object detector that also predicts a positional offset for each sign, and a novel tracking method that combines a learned metric network with a variant of the Hungarian Algorithm, trained on an expanded ARTS dataset.
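To make the idea of a positional offset concrete, here is a minimal sketch of how an offset measured relative to the camera could be turned into a GPS coordinate. The offset format is an assumption: this summary does not say how GPS-RetinaNet encodes it, so the sketch supposes a forward and rightward distance in metres plus a known camera heading. The function name `offset_to_latlon` and the flat-earth approximation are illustrative choices, not the authors' method.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; an approximation


def offset_to_latlon(cam_lat, cam_lon, heading_deg, forward_m, right_m):
    """Shift a camera GPS position by an offset given in the camera frame.

    Hypothetical helper: assumes the offset is (forward, right) in metres
    and that heading_deg is the camera heading, clockwise from north.
    Uses a local flat-earth approximation, reasonable for short distances.
    """
    heading = math.radians(heading_deg)
    # Rotate the camera-frame offset into north/east components.
    north_m = forward_m * math.cos(heading) - right_m * math.sin(heading)
    east_m = forward_m * math.sin(heading) + right_m * math.cos(heading)
    # Convert metres to degrees of latitude and longitude.
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(cam_lat))))
    return cam_lat + dlat, cam_lon + dlon


# Example: a sign 20 m ahead and 3 m to the right of a camera heading east.
sign_lat, sign_lon = offset_to_latlon(44.0, -73.0, 90.0, 20.0, 3.0)
```

Under these assumptions, a camera heading east places a sign that is ahead of it at a larger longitude and a sign to its right at a smaller latitude.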
Findings
Effective detection and geolocalization of traffic signs from low frame rate videos.
Improved accuracy in associating signs across multiple images.
The ARTS dataset supports diverse environment scenarios for future research.
Abstract
Geo-localizing static objects from street images is challenging but also very important for road asset mapping and autonomous driving. In this paper we present a two-stage framework that detects and geolocalizes traffic signs from low frame rate street videos. Our proposed system uses a modified version of RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign relative to the camera, in addition to performing the standard classification and bounding box regression. Candidate sign detections from GPS-RetinaNet are condensed into geolocalized signs by our custom tracker, which consists of a learned metric network and a variant of the Hungarian Algorithm. Our metric network estimates the similarity between pairs of detections, then the Hungarian Algorithm matches detections across images using the similarity scores provided by the metric network. Our models were…
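The matching step described in the abstract can be sketched as an assignment problem over a similarity matrix. The code below is an illustration under stated assumptions, not the authors' tracker: the similarity scores would come from the learned metric network (here they are plain numbers), the function names `hungarian_min_cost` and `match_detections` are hypothetical, and the `min_similarity` cut-off is an assumed stand-in for whatever variant the paper actually uses.

```python
def hungarian_min_cost(cost):
    """Hungarian algorithm (potentials form) for a rows <= cols cost matrix.

    Returns a list of (row, col) pairs with minimum total cost.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row matched to column j (1-indexed, 0 = none)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def match_detections(similarity, min_similarity=0.5):
    """Match detections in frame A (rows) to frame B (columns).

    Maximizes total similarity, then drops pairs below min_similarity
    (an assumed threshold) so that unrelated signs stay unmatched.
    """
    if not similarity or not similarity[0]:
        return []
    rows, cols = len(similarity), len(similarity[0])
    transposed = rows > cols
    if transposed:  # the solver above needs rows <= cols
        similarity = [list(col) for col in zip(*similarity)]
    cost = [[-s for s in row] for row in similarity]  # maximize similarity
    pairs = [(r, c) for r, c in hungarian_min_cost(cost)
             if similarity[r][c] >= min_similarity]
    if transposed:
        pairs = [(c, r) for r, c in pairs]
    return sorted(pairs)


# Example: a greedy matcher would take the 0.9 pair first and do worse overall.
matches = match_detections([[0.9, 0.8], [0.85, 0.1]])
```

In this example the optimal assignment pairs detection 0 with detection 1 and detection 1 with detection 0 (total similarity 1.65), whereas greedily taking the 0.9 pair would leave only the 0.1 pair for the remaining detections.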
Taxonomy
Topics: Automated Road and Building Extraction · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
Methods: Feature Pyramid Network · 1x1 Convolution · Convolution · Focal Loss · RetinaNet
