Mobile Recognition of Wikipedia Featured Sites using Deep Learning and Crowd-sourced Imagery

Jimin Tan; Anastasios Noulas; Diego Sáez; Rossano Schifanella

arXiv:1910.09705·cs.CV·November 5, 2019

Open Access

TL;DR

This paper presents a mobile app that recognizes Wikipedia-featured sites using deep learning trained on crowd-sourced images, addressing challenges with visual similarity and environmental noise, and leveraging contextual data and unsupervised denoising.

Contribution

It introduces an end-to-end pipeline for site recognition combining crowd-sourced data, contextual information, and unsupervised denoising techniques, advancing mobile recognition in urban environments.

Findings

01. Contextual information improves recognition accuracy.

02. Unsupervised denoising enhances model performance.

03. Application performs well in real-world scenarios.

Abstract

Rendering Wikipedia content through mobile and augmented reality mediums can enable new forms of interaction in urban-focused user communities, facilitating learning, communication and knowledge exchange. With this objective in mind, in this work we develop a mobile application that allows for the recognition of notable sites featured on Wikipedia. The application is powered by a deep neural network that has been trained on crowd-sourced imagery describing sites of interest, such as buildings, statues, museums or other physical entities that are present and visually accessible in an urban environment. We describe an end-to-end pipeline covering data collection, model training and evaluation of our application in online and real-world scenarios. We identify a number of challenges in the site recognition task which arise due to visual similarities amongst the classified…

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning