Where is this? Video geolocation based on neural network features

Salvador Medina; Zhuyun Dai; Yingkai Gao

arXiv:1810.09068 · cs.CV · October 24, 2018 · 1 citation

TL;DR

This paper introduces a video geolocation method based on neural network features: it retrieves matching Google Street View images for each frame and aggregates the frame-wise results by voting to locate videos within a city area, reporting 90% precision within 150 meters.

Contribution

It presents a family of voting-based methods for aggregating frame-wise geolocation results; the best one combines NetVLAD (deep learning) and SIFT similarity with the geolocation density of the most similar results.

Findings

01. Achieved 90% precision within 150 meters

02. Developed a new Pittsburgh Downtown video dataset

03. Demonstrated effectiveness of combined NetVLAD and SIFT features
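
A metric such as "precision within 150 meters" can be read as the fraction of predicted locations that land within a fixed radius of the true location. The sketch below illustrates that reading only; the page does not say how the paper computes its metric, and the function names `haversine_m` and `precision_within` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def precision_within(predictions, ground_truth, radius_m=150.0):
    """Fraction of predictions within radius_m of their true location.

    Assumption: one predicted (lat, lon) per video, paired by position
    with one ground-truth (lat, lon).
    """
    if not predictions:
        raise ValueError("no predictions to score")
    hits = sum(haversine_m(p, g) <= radius_m for p, g in zip(predictions, ground_truth))
    return hits / len(predictions)
```

Calling `precision_within(predicted, actual)` with two equally long lists of `(lat, lon)` tuples returns a value between 0 and 1; the 150 m default mirrors the radius quoted in the first finding.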

Abstract

In this work we propose a method that geolocates videos within a delimited, widespread area based solely on the frames' visual content. Our proposed method tackles video geolocation through traditional image retrieval techniques, using Google Street View as the reference. To achieve this goal we represent images with deep learning features obtained from NetVLAD, since the similarity between two such feature vectors can be measured by their L2 distance. In this paper, we propose a family of voting-based methods that aggregate frame-wise geolocation results to boost the video geolocation result. The best aggregation found in our experiments considers both NetVLAD and SIFT similarity, as well as the geolocation density of the most similar results. To test our proposed method, we gathered a new video dataset from the Pittsburgh Downtown area to benefit and stimulate more work in this area. Our…
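
The retrieve-then-vote idea in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' method: descriptors are plain lists of floats standing in for NetVLAD vectors, every retrieved reference image casts one equal vote, and the names `l2_distance`, `top_k`, and `geolocate_video` are hypothetical. The SIFT similarity and geolocation-density terms mentioned for the best aggregation are not modeled here.

```python
import math
from collections import Counter


def l2_distance(u, v):
    """Euclidean (L2) distance between two equal-length descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def top_k(query, reference, k=3):
    """Return the location ids of the k reference descriptors closest to query.

    reference is a list of (location_id, descriptor) pairs, e.g. one entry
    per street-level reference image.
    """
    ranked = sorted(reference, key=lambda item: l2_distance(query, item[1]))
    return [location_id for location_id, _ in ranked[:k]]


def geolocate_video(frame_descriptors, reference, k=3):
    """Pick the location that collects the most votes across all frames.

    Assumption: unweighted voting, one vote per retrieved reference image.
    """
    if not frame_descriptors:
        raise ValueError("video has no frames")
    votes = Counter()
    for descriptor in frame_descriptors:
        votes.update(top_k(descriptor, reference, k))
    return votes.most_common(1)[0][0]
```

With a reference set such as `[("A", [0.0, 0.0]), ("B", [1.0, 0.0])]`, calling `geolocate_video(frames, reference, k=1)` returns the id that wins the most frame-level nearest-neighbor matches. A weighted variant would replace the unit vote with a similarity score.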

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications