Granular Privacy Control for Geolocation with Vision Language Models
Ethan Mendes, Yang Chen, James Hays, Sauvik Das, Wei Xu, Alan Ritter

TL;DR
This paper highlights the privacy risks posed by vision language models' ability to geolocate images and introduces GPTGeoChat, a benchmark dataset, to evaluate and improve models' capacity to moderate geolocation information disclosure.
Contribution
The paper presents GPTGeoChat, a new benchmark dataset for testing how well VLMs moderate the location information disclosed in geolocation dialogues, and evaluates how effectively various models apply this kind of privacy control.
Findings
Fine-tuned models perform on par with API-based models at detecting location information disclosed at the country or city level.
Supervised fine-tuning improves moderation of fine-grained location details.
The widespread geolocation ability of current open-source and proprietary VLMs poses an immediate privacy risk.
Abstract
Vision Language Models (VLMs) are rapidly advancing in their capability to answer information-seeking questions. As these models are widely deployed in consumer applications, they could lead to new privacy risks due to emergent abilities to identify people in photos, geolocate images, etc. As we demonstrate, somewhat surprisingly, current open-source and proprietary VLMs are very capable image geolocators, making widespread geolocation with VLMs an immediate privacy risk, rather than merely a theoretical future concern. As a first step to address this challenge, we develop a new benchmark, GPTGeoChat, to test the ability of VLMs to moderate geolocation dialogues with users. We collect a set of 1,000 image geolocation conversations between in-house annotators and GPT-4v, which are annotated with the granularity of location information revealed at each turn. Using this new dataset, we…
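To make the idea of per-turn granularity annotation concrete, here is a minimal sketch of how a moderation check over such a dialogue might look. It is not the authors' implementation or the GPTGeoChat data format: the `Granularity` levels, the `Turn` fields, and the `first_violation` helper are all hypothetical names chosen for illustration, and the example dialogue is invented.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Granularity(IntEnum):
    """Hypothetical ordering of location detail, coarse to fine."""
    NONE = 0
    COUNTRY = 1
    CITY = 2
    NEIGHBORHOOD = 3
    EXACT_LOCATION = 4


@dataclass
class Turn:
    """One dialogue turn plus an assumed annotation of what it reveals."""
    question: str
    response: str
    revealed: Granularity  # finest location detail revealed up to this turn


def should_withhold(turn: Turn, allowed: Granularity) -> bool:
    """Flag a turn that reveals finer detail than the configured limit."""
    return turn.revealed > allowed


def first_violation(dialogue: List[Turn], allowed: Granularity) -> Optional[int]:
    """Return the index of the first turn exceeding the limit, or None."""
    for index, turn in enumerate(dialogue):
        if should_withhold(turn, allowed):
            return index
    return None


# Invented example dialogue, for illustration only.
dialogue = [
    Turn("Which country is this?", "It looks like Italy.", Granularity.COUNTRY),
    Turn("Which city?", "Probably Rome.", Granularity.CITY),
    Turn("Which restaurant?", "It appears to be a named trattoria.",
         Granularity.EXACT_LOCATION),
]

# With a city-level limit, only the third turn (index 2) is flagged.
print(first_violation(dialogue, Granularity.CITY))
```

Using an ordered enum keeps the policy to a single comparison, so the same dialogue can be checked against different limits (for example, country-only versus city-level) without changing the annotations.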
Taxonomy
Topics: Cryptography and Data Security · Privacy-Preserving Technologies in Data
Methods: Sparse Evolutionary Training
