ICML 2025

Improving the Variance of Differentially Private Randomized Experiments through Clustering

Adel Javanmard, Vahab Mirrokni, Jean Pouget-Abadie

Abstract

Estimating causal effects from randomized experiments is only possible if participants are willing to disclose their potentially sensitive responses. Differential privacy, a widely used framework for ensuring an algorithm's privacy guarantees, can encourage participants to share their responses without the risk of de-anonymization. However, many mechanisms achieve differential privacy by adding noise to the original dataset, which reduces the precision of causal effect estimation. This introduces a fundamental trade-off between privacy and variance when performing causal analyses on differentially private data. In this work, we propose a new differentially private mechanism, CLUSTER-DP, which leverages a given cluster structure in the data to improve the privacy-variance trade-off. While our results apply to any clustering, we demonstrate that selecting higher-quality clusters, according to a quality metric we introduce, can decrease the variance penalty without compromising privacy guarantees. Finally, we evaluate the theoretical and empirical performance of our CLUSTER-DP algorithm on both real and simulated data, comparing it to common baselines, including two special cases of our algorithm: its unclustered version and a uniform-prior version.