Coupling public and private gradient provably helps optimization
Ruixuan Liu, Zhiqi Bu, Yu-xiang Wang, Sheng Zha, George Karypis

TL;DR
This paper shows that coupling gradients from public and private data through a weighted linear combination, with a suitably chosen weight, accelerates neural network training and improves accuracy. The claim is supported by theoretical analysis and empirical validation.
Contribution
It introduces a method to optimally combine public and private gradients, providing theoretical insights and practical guidelines for hyperparameter selection.
Findings
Gradient coupling accelerates convergence in non-convex optimization.
Optimal weighting depends on privacy budget, iterations, batch size, and model size.
Empirical results validate the theoretical benefits across language and vision tasks.
Abstract
The success of large neural networks is crucially determined by the availability of data. It has been observed that training only on a small amount of public data, or privately on abundant private data, can lead to an undesirable degradation of accuracy. In this work, we leverage both private and public data to improve the optimization by coupling their gradients via a weighted linear combination. We derive the optimal weight in the convex setting, which indicates that the weighting coefficient should be hyperparameter-dependent. Then, we prove the acceleration in the convergence of the non-convex loss and characterize the effects of hyperparameters such as privacy budget, number of iterations, batch size, and model size on the choice of the weighting coefficient. We support our analysis with empirical experiments across language and vision benchmarks, and provide a guideline for…
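The weighted linear combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the DP-SGD-style clipping and noising of the private gradient, and the convention that `alpha` multiplies the public gradient are all assumptions made for illustration. The paper's optimal weight formula is not reproduced here, so `alpha` is left as a free parameter.

```python
import numpy as np

def private_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Hypothetical DP-SGD-style gradient: clip each per-sample gradient
    to L2 norm <= clip_norm, sum, add Gaussian noise, then average."""
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # Scale factor is 1 for small gradients and clip_norm / norm otherwise.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=per_sample_grads.shape[1])
    return (clipped.sum(axis=0) + noise) / per_sample_grads.shape[0]

def coupled_gradient(public_grad, private_grad, alpha):
    """Weighted linear combination of the two gradients.
    Assumption: alpha weights the public gradient, (1 - alpha) the private one."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * public_grad + (1.0 - alpha) * private_grad

# Usage with made-up numbers: a private batch of 2 samples, 2 parameters.
rng = np.random.default_rng(0)
per_sample = np.array([[3.0, 4.0], [0.0, 1.0]])
g_priv = private_gradient(per_sample, clip_norm=1.0,
                          noise_multiplier=1.0, rng=rng)
g_pub = np.array([1.0, 1.0])  # gradient from a public batch (not noised)
g = coupled_gradient(g_pub, g_priv, alpha=0.5)
```

Setting `alpha=0` recovers purely private training and `alpha=1` purely public training. According to the abstract, the best value in between depends on the privacy budget, number of iterations, batch size, and model size.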
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques
