Do Vision-Language Models Respect Contextual Integrity in Location Disclosure?
Ruixin Yang, Ethan Mendes, Arthur Wang, James Hays, Sauvik Das, Wei Xu, Alan Ritter

TL;DR
This paper introduces VLM-GEOPRIVACY, a benchmark to evaluate how well vision-language models respect contextual privacy norms in location disclosure, revealing their tendency to over-disclose sensitive information.
Contribution
The paper presents a new benchmark for assessing VLMs' ability to interpret social norms and context in location privacy, highlighting their shortcomings in aligning with human privacy expectations.
Findings
Models often over-disclose sensitive locations.
VLMs are vulnerable to prompt-based privacy attacks.
Current models poorly align with human privacy norms.
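The over-disclosure finding above can be made concrete with a small sketch. It assumes each image has a human-judged appropriate disclosure level and a level the model actually disclosed, both on one coarse-to-fine scale. The `Granularity` levels and the `over_disclosure_rate` function are hypothetical names chosen for illustration. They are not the benchmark's label set or its metric.

```python
from enum import IntEnum


class Granularity(IntEnum):
    # Hypothetical coarse-to-fine scale; NOT the benchmark's actual labels.
    ABSTAIN = 0
    COUNTRY = 1
    CITY = 2
    STREET = 3


def over_disclosure_rate(pairs):
    """Fraction of images where the model disclosed a finer location
    than the human-judged appropriate level.

    pairs: iterable of (model_level, human_level) Granularity values.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    over = sum(1 for model, human in pairs if model > human)
    return over / len(pairs)


# Illustrative, made-up judgments (not data from the paper).
examples = [
    (Granularity.STREET, Granularity.CITY),     # finer than appropriate
    (Granularity.CITY, Granularity.CITY),       # matches
    (Granularity.COUNTRY, Granularity.CITY),    # coarser than appropriate
    (Granularity.STREET, Granularity.ABSTAIN),  # finer than appropriate
]
rate = over_disclosure_rate(examples)
```

Treating the levels as an ordered scale keeps over-disclosure (too fine) distinct from under-disclosure (too coarse), which a plain accuracy score would conflate.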
Abstract
Vision-language models (VLMs) have demonstrated strong performance in image geolocation, a capability further sharpened by frontier multimodal large reasoning models (MLRMs). This poses a significant privacy risk, as these widely accessible models can be exploited to infer sensitive locations from casually shared photos, often at street-level precision, potentially surpassing the level of detail the sharer consented or intended to disclose. While recent work has proposed applying a blanket restriction on geolocation disclosure to combat this risk, these measures fail to distinguish valid geolocation uses from malicious behavior. Instead, VLMs should maintain contextual integrity by reasoning about elements within an image to determine the appropriate level of information disclosure, balancing privacy and utility. To evaluate how well models respect contextual integrity, we introduce…
Peer Reviews
Decision: ICLR 2026 Poster
* Problem is socially relevant: privacy and location disclosure are important deployment issues.
* Benchmark and labeling are carefully designed.
* Evaluation covers many models and prompt styles, producing a clear quantitative picture.
* Writing and visuals are clear.
The paper presumes that location disclosure is inherently undesirable and that the visual context alone suffices to infer disclosure appropriateness. In practice, location sharing on social media is often strategically self-disclosing—users may intentionally reveal or ambiguously hint at places for social signaling, identity performance, or prestige. Without modeling user intent, audience, or platform norms, the proposed notion of “contextual integrity violation” collapses into a moralized prior
- Important and timely problem.
- Sound methodology for constructing the benchmark. Especially appreciated are the efforts made to calibrate the labels and labeling questions well.
- I believe the benchmark will enable targeted research on improving CI for VLMs.
- Interesting and sound evaluation. Promising results with few-shot prompting. Kudos for evaluating different levels of adversaries for location inference.
- The paper focuses solely on geolocation, which is easy to evaluate. However, as shown by prior work [1] (btw a missing relevant citation in this paper) VLMs are capable of inferring other private attributes from images as well, such as sex, age, or income.
- This is maybe half a question: The paper currently defines the appropriate privacy context for each image according to global guidelines. However, in practice I could imagine that a user might not intend to share their location through a g
1. Timely and important problem: contextualized privacy in image geolocation is underexplored yet high-impact for deployment safety.
2. Evaluates a diverse set of VLMs under vanilla, chain-of-thought, and adversarial setups, yielding informative failure patterns.
3. Generally well-written with transparent descriptions of proposed dataset and metrics, making the study easy to follow.
1. Insufficient baselines weaken claims of “strong geolocation.” The paper compares only across VLMs; it lacks head-to-head evaluation against dedicated geolocation systems (e.g., retrieval-based pipelines) on the same test set. I suggest that the authors can add specialized geolocation and classical CV baselines [R1][R2], or report directly comparable numbers from prior work with a careful discussion of any differences. [R1] Ma, Wanlun, et al. "LocGuard: A location privacy defender for image s
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy, Security, and Data Protection · Ethics and Social Impacts of AI
