ICML 2025

Leveraging Randomness in Model and Data Partitioning for Privacy Amplification

Andy Dong, Wei-Ning Chen, Ayfer Özgür

Abstract

We study how inherent randomness in the training process, where each sample (or client in federated learning) contributes to only a randomly selected portion of training, can be leveraged for privacy amplification. This includes (1) model partitioning, where a sample updates only a subset of the model parameters, and (2) data partitioning, where a sample participates in only a subset of training iterations. We apply our framework to model parallelism in federated learning, where each client updates a randomly selected subnetwork to reduce memory and computational overhead, and show that existing methods, e.g., model splitting or dropout, provide a significant privacy amplification gain not captured by previous privacy analysis techniques. Additionally, we introduce Balanced Iteration Subsampling, a new data partitioning method in which each sample (or client) participates in a fixed number of training iterations. We show that this method yields similar or stronger privacy amplification than Poisson (i.i.d.) sampling of data (or clients). Our results demonstrate that randomness in the training process, which is structured rather than i.i.d. and interacts with data in complex ways, can be systematically leveraged for significant privacy amplification.
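
As a rough illustration of the two data partitioning schemes contrasted in the abstract, the following Python sketch generates participation schedules under Poisson sampling and under a balanced scheme in which every sample joins a fixed number of iterations. The function names, the parameter values, and the assumption that each sample picks its iterations uniformly at random without replacement and independently of other samples are illustrative choices, not details taken from the paper. The sketch covers only the sampling step, not the training or the privacy accounting.

```python
import random


def poisson_schedule(num_samples, num_iters, rate, rng):
    """Poisson (i.i.d.) sampling: each sample joins each iteration
    independently with probability `rate`, so its total number of
    participations is random."""
    return [
        {t for t in range(num_iters) if rng.random() < rate}
        for _ in range(num_samples)
    ]


def balanced_schedule(num_samples, num_iters, k, rng):
    """Balanced scheme (assumed form): each sample joins exactly `k`
    iterations, drawn uniformly without replacement."""
    return [set(rng.sample(range(num_iters), k)) for _ in range(num_samples)]


# Hypothetical sizes chosen so both schemes have the same expected
# participation count per sample (rate * num_iters == k).
rng = random.Random(0)
poisson = poisson_schedule(num_samples=5, num_iters=20, rate=0.25, rng=rng)
balanced = balanced_schedule(num_samples=5, num_iters=20, k=5, rng=rng)

print("Poisson participation counts: ", [len(s) for s in poisson])
print("Balanced participation counts:", [len(s) for s in balanced])
```

Under the balanced scheme every sample's participation count equals `k` by construction, whereas under Poisson sampling the counts fluctuate around `rate * num_iters`.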