Bypassing the ambient dimension: Private SGD with gradient subspace identification [preprint]

Preprint date

July 7, 2020

Authors

Yingxue Zhou (Ph.D. student), Zhiwei Steven Wu (adjunct assistant professor), Arindam Banerjee (adjunct professor)

Abstract

Differentially private SGD (DP-SGD) is one of the most popular methods for solving differentially private empirical risk minimization (ERM). Due to its noisy perturbation on each gradient update, the error rate of DP-SGD scales with the ambient dimension $p$, the number of parameters in the model. Such dependence can be problematic for over-parameterized models where $p \gg n$, the number of training samples. Existing lower bounds on private ERM show that such dependence on $p$ is inevitable in the worst case. In this paper, we circumvent the dependence on the ambient dimension by leveraging a low-dimensional structure of gradient space in deep networks: the stochastic gradients for deep nets usually stay in a low-dimensional subspace during training. We propose Projected DP-SGD, which performs noise reduction by projecting the noisy gradients onto a low-dimensional subspace given by the top gradient eigenspace on a small public dataset. We provide a general sample complexity analysis on the public dataset for the gradient subspace identification problem and demonstrate that, under certain low-dimensional assumptions, the public sample complexity grows only logarithmically in $p$. Finally, we provide a theoretical analysis and empirical evaluations to show that our method can substantially improve the accuracy of DP-SGD.
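The projection step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`top_gradient_subspace`, `projected_dp_sgd_step`), the per-sample clipping rule, and the noise calibration (`noise_multiplier * clip_norm`) are assumptions chosen for illustration, and privacy accounting is omitted entirely. The sketch only shows the two ingredients the abstract names: estimating a top-k gradient eigenspace from public gradients, and projecting the noisy private gradient onto it.

```python
import numpy as np


def top_gradient_subspace(public_grads, k):
    """Return a (p, k) orthonormal basis for the top-k eigenspace of the
    second-moment matrix G^T G of public per-sample gradients G (m x p)."""
    # The right singular vectors of G are the eigenvectors of G^T G,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(public_grads, full_matrices=False)
    return vt[:k].T


def projected_dp_sgd_step(w, private_grads, basis, lr, clip_norm,
                          noise_multiplier, rng):
    """One illustrative update: clip, add Gaussian noise, project, step.

    All hyperparameter names here are hypothetical, not from the paper.
    """
    n = private_grads.shape[0]
    # Clip each per-sample gradient to L2 norm at most clip_norm.
    norms = np.linalg.norm(private_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = private_grads * scale
    # Add isotropic Gaussian noise in the full ambient dimension p.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / n
    # Project onto the k-dimensional subspace: V V^T g.
    projected = basis @ (basis.T @ noisy_grad)
    return w - lr * projected
```

After projection, the injected noise lives in a k-dimensional subspace, so its expected norm scales with the square root of k rather than the square root of $p$; this is the intuition behind the noise reduction, under the assumption that the true gradients are well captured by that subspace.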

Link to full paper

Bypassing the ambient dimension: Private SGD with gradient subspace identification

Keywords

machine learning, cryptography, security