PrivBasis: Frequent Itemset Mining with Differential Privacy
Ninghui Li, Wahbeh Qardaji, Dong Su, Jianneng Cao

TL;DR
PrivBasis is a method for frequent itemset mining under differential privacy. It privately constructs a basis set and uses it to find the most frequent itemsets, and in the authors' experiments it greatly outperforms the prior state of the art.
Contribution
We introduce PrivBasis, a new approach leveraging basis sets to efficiently perform differentially private frequent itemset mining.
Findings
Outperforms current state-of-the-art methods
Effectively balances privacy and utility
Demonstrates scalability on large datasets
Abstract
The discovery of frequent itemsets can serve valuable economic and research purposes. Releasing discovered frequent itemsets, however, presents privacy challenges. In this paper, we study the problem of how to perform frequent itemset mining on transaction databases while satisfying differential privacy. We propose an approach, called PrivBasis, which leverages a novel notion called basis sets. A theta-basis set has the property that any itemset with frequency higher than theta is a subset of some basis. We introduce algorithms for privately constructing a basis set and then using it to find the most frequent itemsets. Experiments show that our approach greatly outperforms the current state of the art.
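To make the basis-set property in the abstract concrete, the following is a minimal, non-private sketch that checks whether a candidate collection of bases is a theta-basis set for a small transaction database. It only illustrates the definition and is not the authors' private construction algorithm. The function names, the toy transactions, and the threshold value are all assumptions made for this example.

```python
from itertools import combinations


def frequency(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, theta):
    """Brute-force enumeration of itemsets with frequency strictly above theta.

    Exponential in the number of items, so only suitable for toy data.
    """
    items = sorted(set().union(*transactions))
    result = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            if frequency(combo, transactions) > theta:
                result.append(frozenset(combo))
    return result


def is_theta_basis_set(basis_set, transactions, theta):
    """True if every itemset with frequency > theta is a subset of some basis."""
    return all(
        any(itemset <= basis for basis in basis_set)
        for itemset in frequent_itemsets(transactions, theta)
    )


# Hypothetical toy database: each transaction is a set of items.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"d"}),
]
theta = 0.3  # assumed threshold for this example

# {a, b, c} contains every frequent itemset, so it is a valid basis set.
print(is_theta_basis_set([frozenset("abc")], transactions, theta))

# {a, b} and {c} do not cover the frequent itemset {a, c}.
print(is_theta_basis_set([frozenset("ab"), frozenset("c")], transactions, theta))
```

In this toy database the single basis {a, b, c} covers every itemset above the threshold, while the pair {a, b}, {c} misses {a, c} and {b, c}. A differentially private method cannot run this exact check on raw data; the sketch only shows what property a privately constructed basis set is meant to satisfy.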
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Mining Algorithms and Applications · Imbalanced Data Classification Techniques
