Scene Retrieval for Contextual Visual Mapping
William H. B. Smith, Michael Milford, Klaus D. McDonald-Maier, Shoaib Ehsan

TL;DR
This paper introduces scene retrieval, an approach that uses a CNN trained with triplet loss to classify scene types, and the DMC algorithm, which selectively includes images in the visual map based on scene context to improve localization accuracy.
Contribution
It defines the problem of scene retrieval, proposes a CNN trained with triplet loss for scene classification, and introduces the DMC algorithm to enhance visual mapping with scene context.
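The triplet loss mentioned above can be illustrated with a minimal sketch. This is the standard hinge-style formulation, not necessarily the exact variant used in the paper; the margin value, the Euclidean distance, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin` (an assumed hyperparameter).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (hypothetical values, not taken from the paper).
anchor = [0.0, 0.0]
same_scene = [0.0, 1.0]       # distance 1 from the anchor
different_scene = [3.0, 4.0]  # distance 5 from the anchor

# Well-separated triplet: 1 - 5 + 0.5 is negative, so the loss clamps to 0.
loss_easy = triplet_loss(anchor, same_scene, different_scene)

# Swapped roles: 5 - 1 + 0.5 = 4.5, so the loss is positive.
loss_hard = triplet_loss(anchor, different_scene, same_scene)
```

Training with this objective pulls embeddings of the same scene class together and pushes different classes apart, which is what makes nearest-neighbour comparison against reference images meaningful.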
Findings
Scene retrieval improves scene classification accuracy by up to 7%.
DMC increases inclusion of scene-specific images by 64%.
DMC enhances localization accuracy by 3-10% across datasets.
Abstract
Visual navigation localizes a query place image against a reference database of place images, also known as a 'visual map'. Localization accuracy requirements for specific areas of the visual map, 'scene classes', vary according to the context of the environment and task. State-of-the-art visual mapping is unable to reflect these requirements by explicitly targeting scene classes for inclusion in the map. Four different scene classes, including pedestrian crossings and stations, are identified in each of the Nordland and St. Lucia datasets. Instead of re-training separate scene classifiers, which struggle with these overlapping scene classes, we make our first contribution: defining the problem of 'scene retrieval'. Scene retrieval extends image retrieval to classification of scenes defined at test time by associating a single query image to reference images of scene classes. Our second…
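The idea of associating a single query image with reference images of scene classes defined at test time can be sketched as a nearest-reference lookup over embeddings. This is an illustrative sketch only, not the paper's method: the function `retrieve_scene`, the cosine similarity measure, the `min_similarity` threshold, and the 3-D embeddings are all hypothetical, and in practice the embeddings would come from a trained CNN.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_scene(query, references, min_similarity=0.8):
    """Return (label, score) for the most similar reference embedding.

    `references` maps a scene-class label to a list of reference embeddings.
    Because classes are supplied as data, new ones can be added at test time
    without retraining. Returns (None, score) when no reference reaches
    `min_similarity` (an assumed threshold).
    """
    best_label, best_score = None, -1.0
    for label, embeddings in references.items():
        for embedding in embeddings:
            score = cosine_similarity(query, embedding)
            if score > best_score:
                best_label, best_score = label, score
    if best_score < min_similarity:
        return None, best_score
    return best_label, best_score


# Hypothetical reference embeddings for two scene classes.
references = {
    "station": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "pedestrian_crossing": [[0.0, 1.0, 0.0]],
}

label, score = retrieve_scene([0.95, 0.05, 0.0], references)
unknown_label, unknown_score = retrieve_scene([0.0, 0.0, 1.0], references)
```

Here the first query is assigned to the "station" class, while the second matches no reference closely enough and is left unlabelled.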
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications
