GDPR-Bench-Android: A Benchmark for Evaluating Automated GDPR Compliance Detection in Android
Huaijin Ran, Haoyi Zhang, Xunzhu Tang

TL;DR
GDPR-Bench-Android is a benchmark dataset and evaluation framework for assessing automated GDPR compliance detection methods on Android source code. It annotates violations at file-, module-, and line-level granularity and compares methods from several different paradigms.
Contribution
It introduces the first extensive benchmark of annotated GDPR violations, contributes a formal-method baseline (Formal-AST), and evaluates 11 diverse methods, revealing their strengths and limitations.
Findings
No single method excels across all tasks.
The ReAct agent performs best at file-level localization.
The Qwen2.5-72B LLM excels at line-level detection.
Abstract
Automating the detection of EU General Data Protection Regulation (GDPR) violations in source code is a critical but underexplored challenge. We introduce GDPR-Bench-Android, the first comprehensive benchmark for evaluating diverse automated methods for GDPR compliance detection in Android applications. It contains 1951 manually annotated violation instances from 15 open-source repositories, covering 23 GDPR articles at file-, module-, and line-level granularities. To enable a multi-paradigm evaluation, we contribute Formal-AST, a novel, source-code-native formal method that serves as a deterministic baseline. We define two tasks: (1) multi-granularity violation localization, evaluated via Accuracy@k; and (2) snippet-level multi-label classification, assessed by macro-F1 and other classification metrics. We benchmark 11 methods,…
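The two metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the benchmark's official scorer. It assumes that Accuracy@k counts an instance as a hit when any of the top-k predicted GDPR articles appears in the gold annotation, and that macro-F1 is the unweighted mean of per-article F1 scores. The function names `accuracy_at_k` and `macro_f1` and the sample data are hypothetical.

```python
from typing import Sequence, Set


def accuracy_at_k(ranked_preds: Sequence[Sequence[int]],
                  gold: Sequence[Set[int]], k: int) -> float:
    """Share of instances whose top-k predictions contain a gold article.

    Assumption: a single matching article is enough for a hit.
    """
    if len(ranked_preds) != len(gold):
        raise ValueError("predictions and gold labels differ in length")
    if not gold:
        return 0.0
    hits = sum(1 for preds, truth in zip(ranked_preds, gold)
               if set(preds[:k]) & truth)
    return hits / len(gold)


def macro_f1(pred_labels: Sequence[Set[int]],
             gold_labels: Sequence[Set[int]]) -> float:
    """Unweighted mean of per-article F1 over all articles seen."""
    labels = sorted(set().union(*gold_labels, *pred_labels))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        tp = fp = fn = 0
        for pred, truth in zip(pred_labels, gold_labels):
            if label in pred and label in truth:
                tp += 1
            elif label in pred:
                fp += 1
            elif label in truth:
                fn += 1
        denom = 2 * tp + fp + fn
        # An article with no true positives scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: article numbers ranked by model confidence.
ranked = [[6, 7, 13], [5, 32], [17]]
gold = [{7}, {6}, {17}]
acc_at_1 = accuracy_at_k(ranked, gold, k=1)
acc_at_2 = accuracy_at_k(ranked, gold, k=2)
f1 = macro_f1([{6, 7}, {5}, {17}], gold)
```

Macro averaging gives every article the same weight, so rarely violated articles influence the score as much as common ones.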
Taxonomy
Topics: Advanced Malware Detection Techniques · Privacy, Security, and Data Protection · Software Engineering Research
