Privacy-Preserving Logistic Regression Training on Large Datasets

John Chiang

arXiv:2406.13221·cs.CR·April 7, 2025

TL;DR

This paper introduces an efficient homomorphic encryption-based algorithm for privacy-preserving logistic regression on large datasets, utilizing a quadratic gradient for faster convergence and demonstrating practical feasibility on real financial data.

Contribution

It proposes a mini-batch homomorphic logistic regression training algorithm using quadratic gradient, improving efficiency for large encrypted datasets.

Findings

01 Mini-batch algorithm outperforms full-batch in speed.

02 Practical training on large encrypted datasets demonstrated.

03 Quadratic gradient accelerates convergence.

Abstract

Privacy-preserving machine learning is a class of cryptographic methods for analyzing private and sensitive data without compromising privacy; homomorphic logistic regression training over large encrypted data is one example. In this paper, we propose an efficient algorithm for logistic regression training on large encrypted data using Homomorphic Encryption (HE). It is the mini-batch version of recent methods that use a faster gradient variant called the quadratic gradient. The quadratic gradient is claimed to integrate curvature information (the Hessian matrix) into the gradient and thereby to accelerate first-order gradient (descent) algorithms. We also implement the full-batch version of their method for the case where the encrypted dataset is so large that it has to be encrypted in the mini-batch manner. We compare our mini-batch algorithm with our full-batch…
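To make the idea in the abstract concrete, the following is a minimal plaintext sketch (no encryption involved) of mini-batch logistic regression with a Hessian-informed gradient. The abstract does not give the construction, so the details are assumptions for illustration: the per-batch bound (1/4) XᵀX on the logistic Hessian, the diagonal scaling 1/(eps + row sum of absolute values), and the names `diag_preconditioner` and `train_minibatch` are all hypothetical and not taken from the paper or its repository.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def diag_preconditioner(X, eps=1e-8):
    # ASSUMPTION: use the fixed bound (1/4) X^T X in place of the true
    # Hessian, then collapse it to one scale per coordinate:
    # 1 / (eps + sum_j |H_ij|). This is an illustrative choice only.
    H = 0.25 * (X.T @ X)
    return 1.0 / (eps + np.abs(H).sum(axis=1))

def train_minibatch(X, y, batch_size=32, epochs=30, lr=1.0, seed=0):
    # X: (n, d) features, y: (n,) labels in {0, 1}. Plaintext only.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Gradient of the batch log-likelihood (we ascend it).
            grad = Xb.T @ (yb - sigmoid(Xb @ w))
            # Curvature-scaled gradient: elementwise product with the
            # diagonal preconditioner computed from this batch.
            scaled_grad = diag_preconditioner(Xb) * grad
            w = w + lr * scaled_grad
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.hstack([np.ones((400, 1)), rng.normal(size=(400, 2))])
    w_true = np.array([0.5, 2.0, -3.0])
    y = (rng.random(400) < sigmoid(X @ w_true)).astype(float)
    w = train_minibatch(X, y)
    acc = np.mean((sigmoid(X @ w) > 0.5) == (y > 0.5))
    print("weights:", w, "train accuracy:", acc)
```

Because the scaling is derived from the data rather than tuned by hand, the step size adapts to each coordinate. An encrypted version would differ substantially: HE schemes generally evaluate only additions and multiplications, so the sigmoid would need a polynomial approximation, and the data layout would have to suit ciphertext packing.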

Code & Models

Repositories

petitioner/he.lr (Official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data