Combi-CAM: A Novel Multi-Layer Approach for Explainable Image Geolocalization
David Faget (CB), José Luis Lisani, Miguel Colom (CB, CMLA)

TL;DR
Combi-CAM is a multi-layer explainability method for CNN-based image geolocalization. It combines gradient-weighted activation maps from several network layers, rather than only the deepest one, to show which image features drive a location prediction.
Contribution
This paper proposes Combi-CAM, a novel multi-layer approach that enhances CNN explainability in geolocalization by integrating gradient-weighted activation maps from multiple layers.
Findings
Provides more detailed feature attribution insights.
Improves understanding of CNN decision-making in geolocalization.
Enhances interpretability over traditional single-layer methods.
Abstract
Planet-scale photo geolocalization involves the intricate task of estimating the geographic location depicted in an image purely based on its visual features. While deep learning models, particularly convolutional neural networks (CNNs), have significantly advanced this field, understanding the reasoning behind their predictions remains challenging. In this paper, we present Combi-CAM, a novel method that enhances the explainability of CNN-based geolocalization models by combining gradient-weighted class activation maps obtained from several layers of the network architecture, rather than using only information from the deepest layer as is typically done. This approach provides a more detailed understanding of how different image features contribute to the model's decisions, offering deeper insights than the traditional approaches.
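To make the multi-layer idea concrete, here is a minimal NumPy sketch. It computes a standard Grad-CAM map per layer (channel weights are the spatially averaged gradients, followed by a ReLU), brings each map to a common resolution, and merges them. The fusion rule used here (normalize each map to [0, 1], then average) is an assumption chosen for illustration; the abstract does not specify how Combi-CAM combines the maps. The helper names `grad_cam`, `upsample_nearest`, and `combine_cams` are hypothetical, and the activations and gradients are assumed to have been extracted from the network beforehand.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Standard Grad-CAM map for one layer.

    activations, gradients: arrays of shape (K, H, W), where gradients
    holds d(score)/d(activations) for the predicted class.
    """
    weights = gradients.mean(axis=(1, 2))             # one weight per channel, shape (K,)
    cam = np.tensordot(weights, activations, axes=1)  # weighted channel sum, shape (H, W)
    return np.maximum(cam, 0.0)                       # ReLU keeps positive evidence only

def normalize(cam, eps=1e-8):
    """Scale a non-negative map to roughly [0, 1]."""
    return cam / (cam.max() + eps)

def upsample_nearest(cam, size):
    """Nearest-neighbour resize of a 2-D map to size = (height, width)."""
    h, w = cam.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return cam[np.ix_(rows, cols)]

def combine_cams(layers, size):
    """Average the normalized per-layer maps (illustrative fusion rule,
    not necessarily the one used by Combi-CAM)."""
    maps = [upsample_nearest(normalize(grad_cam(a, g)), size) for a, g in layers]
    return np.mean(maps, axis=0)

# Toy usage with random tensors standing in for two layers of a CNN:
# a shallow layer (fine resolution) and a deep layer (coarse resolution).
rng = np.random.default_rng(0)
layers = [
    (rng.random((8, 16, 16)), rng.standard_normal((8, 16, 16))),
    (rng.random((16, 4, 4)), rng.standard_normal((16, 4, 4))),
]
heatmap = combine_cams(layers, size=(32, 32))
print(heatmap.shape)
```

In this sketch the shallow layer contributes fine spatial detail and the deep layer contributes coarse, semantically driven regions. A real pipeline would likely use a smoother interpolation than nearest-neighbour and might weight layers unequally.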
Taxonomy
Topics: Remote-Sensing Image Classification · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
