Mobile Recognition of Wikipedia Featured Sites using Deep Learning and Crowd-sourced Imagery
Jimin Tan, Anastasios Noulas, Diego Sáez, Rossano Schifanella

TL;DR
This paper presents a mobile app that recognizes Wikipedia-featured sites using a deep neural network trained on crowd-sourced images. It addresses challenges posed by visual similarity and environmental noise by leveraging contextual data and unsupervised denoising.
Contribution
It introduces an end-to-end pipeline for site recognition combining crowd-sourced data, contextual information, and unsupervised denoising techniques, advancing mobile recognition in urban environments.
Findings
Contextual information improves recognition accuracy.
Unsupervised denoising enhances model performance.
Application performs well in real-world scenarios.
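As an illustration of the first finding, here is a minimal sketch of how contextual information could be combined with classifier output. It assumes the context is the user's geographic position and that each candidate site has known coordinates; the summary above does not specify the form of the context, so this is an assumption. The function names, the exponential distance weighting, and the `scale_km` parameter are hypothetical and are not taken from the paper.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def rerank_with_context(scores, site_coords, user_coord, scale_km=0.5):
    """Re-weight classifier scores by proximity to the user.

    scores      -- dict mapping site name to classifier probability
    site_coords -- dict mapping site name to (lat, lon)
    user_coord  -- (lat, lon) of the device (assumed form of context)
    scale_km    -- hypothetical decay scale: larger values trust location less
    """
    weighted = {}
    for site, prob in scores.items():
        lat, lon = site_coords[site]
        dist = haversine_km(user_coord[0], user_coord[1], lat, lon)
        # Exponential decay is an illustrative choice, not the paper's method.
        weighted[site] = prob * math.exp(-dist / scale_km)
    total = sum(weighted.values())
    if total == 0.0:
        # Every candidate is too far away to matter; keep the visual scores.
        return dict(scores)
    return {site: w / total for site, w in weighted.items()}


# Two visually similar sites: the classifier slightly prefers the distant one.
scores = {"site_a": 0.55, "site_b": 0.45}
site_coords = {"site_a": (40.05, -74.0), "site_b": (40.0, -74.0)}
reranked = rerank_with_context(scores, site_coords, user_coord=(40.0, -74.0))
best = max(reranked, key=reranked.get)
```

In this toy case the user stands next to `site_b`, so the re-ranked scores favour it even though the visual classifier alone preferred `site_a`. This is the kind of ambiguity between similar-looking sites that contextual information is meant to resolve.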
Abstract
Rendering Wikipedia content through mobile and augmented reality mediums can enable new forms of interaction in urban-focused user communities, facilitating learning, communication and knowledge exchange. With this objective in mind, in this work we develop a mobile application that allows for the recognition of notable sites featured on Wikipedia. The application is powered by a deep neural network that has been trained on crowd-sourced imagery describing sites of interest, such as buildings, statues, museums or other physical entities that are present and visually accessible in an urban environment. We describe an end-to-end pipeline that covers data collection, model training and evaluation of our application considering online and real-world scenarios. We identify a number of challenges in the site recognition task which arise due to visual similarities amongst the classified…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
