Hybrid Privacy Policy-Code Consistency Check using Knowledge Graphs and LLMs

Zhenyu Mao; Xinxin Fan; Yifei Wang; Jacky Keung; Jialong Li

arXiv:2505.11502·cs.CR·May 20, 2025

Open Access

TL;DR

This paper presents a hybrid method that combines knowledge graphs and LLMs for privacy policy-code consistency checking, improving precision and F1-score in detecting privacy violations while reducing token consumption and runtime.

Contribution

It introduces a novel hybrid approach that leverages deterministic knowledge graphs and LLMs for more accurate and cost-effective privacy policy verification.

Findings

01 37.63% increase in precision

02 23.13% increase in F1-score

03 93.5% reduction in token consumption

Abstract

Growing concern over the misuse of user privacy has accelerated research into checking the consistency between smartphone apps' declared privacy policies and their actual behaviors. Recent advances in Large Language Models (LLMs) have introduced promising techniques for semantic comparison, but these methods often suffer from low accuracy and high computational cost. To address this problem, this paper proposes a novel hybrid approach that integrates 1) knowledge graph-based deterministic checking to ensure higher accuracy, and 2) LLMs used exclusively for preliminary semantic analysis to save computational cost. Preliminary evaluation indicates that this hybrid approach not only achieves a 37.63% increase in precision and a 23.13% increase in F1-score but also consumes 93.5% fewer tokens and takes 87.3% less time.
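
The two-stage split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the graph contents, the API labels, and the function names (`extract_declared_types`, `ancestors`, `check_consistency`) are all hypothetical, and the LLM step is replaced by a keyword lookup so the example runs offline. It only shows the general idea of using a semantic step once to read the policy and then a deterministic graph lookup for every code-level check.

```python
# Hypothetical knowledge graph (illustrative entries only).
# API_TO_DATA maps a code-level API label to the data type it accesses.
API_TO_DATA = {
    "LocationManager.getLastKnownLocation": "precise_location",
    "TelephonyManager.getDeviceId": "device_id",
    "ContactsProvider.query": "contacts",
}
# IS_A maps a data type to its broader category.
IS_A = {
    "precise_location": "location",
    "device_id": "identifier",
    "contacts": "personal_data",
}


def ancestors(data_type):
    """Return the data type plus every broader category reachable via IS_A."""
    chain = [data_type]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return set(chain)


def extract_declared_types(policy_text):
    """Stand-in for the LLM step (assumption): map policy wording to graph labels.

    A real system would call a language model here; a keyword lookup keeps
    this sketch self-contained.
    """
    keywords = {
        "location": "location",
        "device identifier": "identifier",
        "contacts": "contacts",
    }
    text = policy_text.lower()
    return {label for phrase, label in keywords.items() if phrase in text}


def check_consistency(policy_text, observed_apis):
    """Deterministic check: flag APIs whose data type is not covered by the policy."""
    declared = extract_declared_types(policy_text)  # semantic step, run once
    violations = []
    for api in observed_apis:
        data_type = API_TO_DATA.get(api)
        if data_type is None:
            continue  # API not in the graph; this sketch simply skips it
        if not (ancestors(data_type) & declared):
            violations.append((api, data_type))
    return violations


policy = "We collect your location and device identifier."
apis = [
    "LocationManager.getLastKnownLocation",
    "TelephonyManager.getDeviceId",
    "ContactsProvider.query",
]
print(check_consistency(policy, apis))
```

In this sketch the semantic step runs once per policy, while each observed API is resolved by a graph lookup. That is one plausible reading of why such a split could lower token usage, though the paper's actual pipeline may differ.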

Taxonomy

Topics: Advanced Malware Detection Techniques · Privacy, Security, and Data Protection · Hate Speech and Cyberbullying Detection