AAAI2025
Differential Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates
Puning Zhao, Jiafei Wu, Zhe Liu, Chong Wang, Rongfei Fan, Qingming Li
1 citation
Abstract
We study convex optimization problems under differential privacy (DP). With heavy-tailed gradients, existing works achieve suboptimal rates. The main obstacle is that existing gradient estimators have suboptimal tail properties, resulting in a superfluous factor of d in the union bound. In this paper, we explore algorithms achieving optimal rates of DP optimization with heavy-tailed gradients. Our first method is a simple clipping approach. Under bounded p-th order moments of gradients, with n samples, it achieves We then propose an iterative updating method, which is more complex but achieves this rate for all ǫ ≤ 1. The results significantly improve over existing methods. Such improvement relies on a careful treatment of the tail behavior of gradient estimators. Our results match the minimax lower bound in [1], indicating that the theoretical limit of stochastic convex optimization under DP is achievable. Source Bound of risk Comparison of risk bounds of stochastic optimization under (ǫ, δ)-DP with p-th order bounded moments on gradients. Logarithmic factors are omitted here. The remaining drawback is that this method has an additional term d 2) Iterative updating. This method is proposed to remove the additional term of the simple clipping method. It divides the data into k groups. For each group, this method calculates the group-wise mean and adds noise to meet DP requirements. After that, the mean estimate is iteratively updated based on the estimation of distances and directions to the ground truth ∇F (w t ). Such design is inspired by several existing methods for non-private mean estimation with heavy-tailed data [19] [20] [21] [22] . Compared with the simple clipping approach, this method improves the tail behavior of the mean estimator from subexponential to subgaussian. Moreover, this method is invariant to permutations of groups. As a result, the overall privacy of the final estimate is amplified compared with the privacy of each group [23, 24] . With this