Hybrid Privacy Policy-Code Consistency Check using Knowledge Graphs and LLMs
Zhenyu Mao, Xinxin Fan, Yifei Wang, Jacky Keung, Jialong Li

TL;DR
This paper presents a hybrid method combining knowledge graphs and LLMs to improve privacy policy-code consistency checks, achieving higher accuracy and efficiency in detecting privacy violations.
Contribution
It introduces a novel hybrid approach that leverages deterministic knowledge graphs and LLMs for more accurate and cost-effective privacy policy verification.
Findings
37.63% increase in precision
23.13% increase in F1-score
93.5% reduction in token consumption
Abstract
Growing concern over the misuse of user privacy has accelerated research into checking the consistency between smartphone apps' declared privacy policies and their actual behaviors. Recent advances in Large Language Models (LLMs) have introduced promising techniques for semantic comparison, but these methods often suffer from low accuracy and high computational cost. To address this problem, this paper proposes a novel hybrid approach that integrates 1) knowledge graph-based deterministic checking to ensure higher accuracy, and 2) LLMs used exclusively for preliminary semantic analysis to reduce computational cost. Preliminary evaluation indicates that this hybrid approach not only achieves a 37.63% increase in precision and a 23.13% increase in F1-score but also consumes 93.5% fewer tokens and takes 87.3% less time.
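The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the triple format, the `IS_A` hierarchy, and the function names are all hypothetical, and the declared practices are hard-coded where an LLM's preliminary semantic analysis of the policy text would normally supply them. The sketch shows only the deterministic part, a graph lookup that flags observed behaviors not covered by any declared practice.

```python
# Hypothetical is-a edges of a tiny knowledge graph (child -> parent).
# These data types are illustrative and not taken from the paper.
IS_A = {
    "gps_location": "location",
    "location": "personal_data",
    "email_address": "contact_info",
    "contact_info": "personal_data",
}


def ancestors(data_type):
    """Yield data_type itself and every more general type above it."""
    current = data_type
    while current is not None:
        yield current
        current = IS_A.get(current)


def find_violations(declared, observed):
    """Return observed (action, data_type) pairs that no declaration covers.

    A behavior is covered when the policy declares the same action on the
    same data type or on any more general type in the hierarchy.
    """
    violations = []
    for action, data_type in sorted(observed):
        covered = any((action, t) in declared for t in ancestors(data_type))
        if not covered:
            violations.append((action, data_type))
    return violations


# Assumed output of the LLM step: practices the policy declares.
declared = {("collect", "location"), ("share", "contact_info")}

# Assumed output of code analysis: behaviors the app actually exhibits.
observed = {
    ("collect", "gps_location"),
    ("share", "gps_location"),
    ("share", "email_address"),
}

violations = find_violations(declared, observed)
print(violations)
```

In this sketch the LLM would be called once per policy to produce `declared`, while every individual check is a cheap set lookup, which is one plausible reading of how such a hybrid design could cut token consumption.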
Taxonomy
Topics: Advanced Malware Detection Techniques · Privacy, Security, and Data Protection · Hate Speech and Cyberbullying Detection
