Words Blending Boxes. Obfuscating Queries in Information Retrieval using Differential Privacy
Francesco Luigi De Faveri, Guglielmo Faggioli, Nicola Ferro

TL;DR
This paper introduces Word Blending Boxes, a novel differentially private mechanism for obfuscating search queries, effectively balancing user privacy protection with retrieval accuracy in information retrieval systems.
Contribution
The paper presents a new privacy mechanism that enhances query obfuscation by using safe boxes, addressing limitations of existing differential privacy approaches in NLP-based IR.
Findings
Word Blending Boxes (WBB) effectively protects user privacy in queries.
WBB maintains high relevance in document retrieval.
The mechanism is practical for integration into existing IR systems.
Abstract
Ensuring the effectiveness of search queries while protecting user privacy remains an open issue. When an Information Retrieval System (IRS) does not protect the privacy of its users, sensitive information may be disclosed through the queries sent to the system. Recent improvements, especially in NLP, have shown the potential of using Differential Privacy to obfuscate texts while maintaining satisfactory effectiveness. However, such approaches may protect the user's privacy only from a theoretical perspective while, in practice, the real user's information need can still be inferred if perturbed terms are too semantically similar to the original ones. We overcome such limitations by proposing Word Blending Boxes, a novel differentially private mechanism for query obfuscation, which protects the words in the user queries by employing safe boxes. To measure the overall effectiveness of…
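The abstract's central concern, that a perturbed term may sit too close to the original to hide the information need, can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify how safe boxes are built, so the toy 2-D embeddings, the function names (candidate_box, obfuscate_word, obfuscate_query), and the parameters k_exclude and n_candidates are all hypothetical. The sketch assumes one plausible reading: skip the nearest neighbours of a word, then sample a replacement from the next-nearest ones with exponential-mechanism-style weights. It makes no claim of a formal differential privacy guarantee.

```python
import math
import random

# Hypothetical 2-D embeddings, invented for illustration only.
TOY_EMBEDDINGS = {
    "doctor": (1.0, 1.0),
    "physician": (1.1, 1.0),
    "nurse": (1.5, 1.2),
    "hospital": (2.0, 1.5),
    "clinic": (2.2, 1.4),
    "school": (4.0, 3.0),
    "teacher": (4.2, 3.1),
}


def candidate_box(word, embeddings, k_exclude, n_candidates):
    """Return (distance, word) pairs: skip the k_exclude nearest
    neighbours of `word`, then keep the next n_candidates."""
    origin = embeddings[word]
    ranked = sorted(
        (math.dist(origin, vec), other)
        for other, vec in embeddings.items()
        if other != word
    )
    return ranked[k_exclude:k_exclude + n_candidates]


def obfuscate_word(word, embeddings, epsilon, k_exclude=1, n_candidates=3, rng=random):
    """Sample a replacement with weight exp(-epsilon * distance / 2).
    Larger epsilon favours closer candidates (assumed scoring, not the paper's)."""
    box = candidate_box(word, embeddings, k_exclude, n_candidates)
    if not box:
        return word  # nothing to sample from; leave the word unchanged
    weights = [math.exp(-epsilon * dist / 2.0) for dist, _ in box]
    words = [other for _, other in box]
    return rng.choices(words, weights=weights, k=1)[0]


def obfuscate_query(query, embeddings, epsilon, rng=random):
    """Obfuscate each in-vocabulary term; out-of-vocabulary terms pass through."""
    return " ".join(
        obfuscate_word(term, embeddings, epsilon, rng=rng) if term in embeddings else term
        for term in query.split()
    )


if __name__ == "__main__":
    print(obfuscate_query("doctor near school", TOY_EMBEDDINGS, epsilon=1.0))
```

In this toy setup, "doctor" can never be replaced by "physician", its nearest neighbour, because that word falls inside the excluded region. Excluding near-synonyms in this way is the intuition the abstract describes, under the assumptions stated above.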
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Internet Traffic Analysis and Secure E-voting
