Fast Privacy-Preserving Text Classification based on Secure Multiparty Computation

Amanda Resende; Davis Railsback; Rafael Dowsley; Anderson C. A. Nascimento; Diego F. Aranha

arXiv:2101.07365·cs.CR·June 9, 2021

TL;DR

This paper introduces a fast, privacy-preserving Naive Bayes text classifier based on Secure Multiparty Computation (SMC). It enables private spam detection in which the message holder learns only the classification result and the model holder learns nothing.

Contribution

It presents a novel, secure Naive Bayes classifier implementation based on SMC, optimized for speed and applied to private text classification tasks.

Findings

01

Classifies an SMS as spam or ham in under 340 ms when the model's dictionary includes all words (n = 5200) and the message has at most m = 160 unigrams.

02

Achieves a classification time of 21 ms in the smaller setting (dictionary size n = 369) with typical spam messages.

03

Provides a secure, efficient solution that is generic: it applies to any scenario in which a Naive Bayes classifier can be employed.

Abstract

We propose a privacy-preserving Naive Bayes classifier and apply it to the problem of private text classification. In this setting, a party (Alice) holds a text message, while another party (Bob) holds a classifier. At the end of the protocol, Alice will only learn the result of the classifier applied to her text input and Bob learns nothing. Our solution is based on Secure Multiparty Computation (SMC). Our Rust implementation provides a fast and secure solution for the classification of unstructured text. Applying our solution to the case of spam detection (the solution is generic, and can be used in any other scenario in which the Naive Bayes classifier can be employed), we can classify an SMS as spam or ham in less than 340ms in the case where the dictionary size of Bob's model includes all words (n = 5200) and Alice's SMS has at most m = 160 unigrams. In the case with n = 369 and m…
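To make the setting concrete, here is a minimal plaintext sketch of the Naive Bayes decision that a protocol of this kind would evaluate privately. It is not the paper's protocol and contains no SMC: it only shows the function being computed, in which Alice's unigrams select per-word log-likelihoods from Bob's model and the class with the larger sum wins. The names `NaiveBayesModel`, `is_spam`, and `toy_model`, as well as every probability value, are hypothetical and chosen for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical plaintext model (Bob's input in the abstract's setting).
struct NaiveBayesModel {
    log_prior_spam: f64,
    log_prior_ham: f64,
    /// word -> (ln P(word | spam), ln P(word | ham))
    log_likelihoods: HashMap<String, (f64, f64)>,
}

impl NaiveBayesModel {
    /// Returns true if the message scores higher under the spam class.
    /// In the private setting, only this one bit would be revealed to Alice.
    fn is_spam(&self, message: &str) -> bool {
        let mut spam_score = self.log_prior_spam;
        let mut ham_score = self.log_prior_ham;
        for word in message.split_whitespace() {
            let key = word.to_lowercase();
            // Assumption of this sketch: words outside the dictionary are skipped.
            if let Some(&(ln_spam, ln_ham)) = self.log_likelihoods.get(&key) {
                spam_score += ln_spam;
                ham_score += ln_ham;
            }
        }
        spam_score > ham_score
    }
}

/// Builds a four-word toy model with made-up probabilities.
fn toy_model() -> NaiveBayesModel {
    let mut log_likelihoods = HashMap::new();
    log_likelihoods.insert("free".to_string(), (0.30f64.ln(), 0.02f64.ln()));
    log_likelihoods.insert("prize".to_string(), (0.20f64.ln(), 0.01f64.ln()));
    log_likelihoods.insert("lunch".to_string(), (0.01f64.ln(), 0.15f64.ln()));
    log_likelihoods.insert("tomorrow".to_string(), (0.02f64.ln(), 0.20f64.ln()));
    NaiveBayesModel {
        log_prior_spam: 0.2f64.ln(),
        log_prior_ham: 0.8f64.ln(),
        log_likelihoods,
    }
}

fn main() {
    let model = toy_model();
    for message in ["Free prize inside", "lunch tomorrow"] {
        println!("{:?} -> spam: {}", message, model.is_spam(message));
    }
}
```

Working in log space turns the product of per-word probabilities into a sum. The dictionary size n and message length m quoted in the abstract correspond here to the number of entries in `log_likelihoods` and the number of words in `message`.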
