ICLR 2026

Near Optimal Robust Federated Learning Against Data Poisoning Attack

Jingfan Yu, Zhixuan Fang

Abstract

We revisit data poisoning attacks in federated learning. In our setting, $m$ worker nodes, each holding $n$ training samples, cooperatively train a single model for a machine-learning task, and a fraction $\alpha$ of the workers may be subject to a data poisoning attack. We focus on the challenging and practical regime where $n$ is small and $m$ is large: no single worker has enough statistical information to identify poisoned data on its own, yet collectively the workers hold enough data to learn the task once the poisoned data are detected. We therefore propose a mechanism by which workers cooperatively detect the workers holding poisoned data. In terms of attack loss, our mechanism achieves $\tilde{O}((\frac{1}{n})^{\frac{1}{2}}+(\frac{d}{mn})^{\frac{1}{2}})$ in the IID setting and $\tilde{O}((\frac{1}{\gamma})^{\frac{1}{2}}+(\frac{1}{n})^{\frac{1}{2}}+(\frac{d}{mn})^{\frac{1}{2}})$ in the non-IID setting, where $d$ is the VC-dimension of the learning model and $\gamma$ is a concentration parameter characterizing the degree of non-IIDness. Beyond attack loss, our mechanism also limits the adversary's free-ride gain, even when that gain cannot be directly quantified by the attack loss. We also establish a lower bound on the attack loss, and our algorithm matches this lower bound as $m\rightarrow \infty$ in both the IID and non-IID settings.
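To give a feel for how the stated rates scale, the following minimal sketch evaluates the two expressions inside the $\tilde{O}(\cdot)$ bounds for a few values of $m$. It is only an illustration: the constants and logarithmic factors hidden by $\tilde{O}$ are dropped, the function names `iid_rate` and `non_iid_rate` are hypothetical, and the numeric values of $n$, $d$, and $\gamma$ are arbitrary assumptions rather than settings from the paper.

```python
import math


def iid_rate(n, m, d):
    """Rate expression from the IID bound, without constants or log factors."""
    return math.sqrt(1.0 / n) + math.sqrt(d / (m * n))


def non_iid_rate(n, m, d, gamma):
    """Rate expression from the non-IID bound: adds the (1/gamma)^(1/2) term."""
    return math.sqrt(1.0 / gamma) + iid_rate(n, m, d)


if __name__ == "__main__":
    # Arbitrary illustrative values (assumptions, not taken from the paper).
    n, d, gamma = 20, 100, 50.0
    for m in (10, 1_000, 100_000):
        print(m, round(iid_rate(n, m, d), 4), round(non_iid_rate(n, m, d, gamma), 4))
```

As $m$ grows, the $(\frac{d}{mn})^{\frac{1}{2}}$ term vanishes, so the expressions approach $(\frac{1}{n})^{\frac{1}{2}}$ in the IID case and $(\frac{1}{\gamma})^{\frac{1}{2}}+(\frac{1}{n})^{\frac{1}{2}}$ in the non-IID case. This is the large-$m$ regime in which the abstract states that the algorithm matches the lower bound.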