CCS2025

VillainNet: Targeted Poisoning Attacks Against SuperNets Along the Accuracy-Latency Pareto Frontier

David Oygenblik, Abhinav Vemulapalli, Animesh Agrawal, Debopam Sanyal, Alexey Tumanov, Brendan Saltaformaggio

Abstract

State-of-the-art (SOTA) weight-shared SuperNets dynamically activate subnetworks at runtime, enabling robust adaptive inference under varying deployment conditions. However, we find that adversaries can take advantage of the unique training and inference paradigms of SuperNets to selectively implant backdoors that activate only within specific subnetworks, remaining dormant across billions of other subnetworks. We present VillainNet (VNET), a novel poisoning methodology that restricts backdoor activation to attacker-chosen subnetworks, tailored either to specific operational scenarios (e.g., specific vehicle speeds or weather conditions) or to specific subnetwork configurations. VNET's core innovation is a novel, distance-aware optimization process that leverages architectural and computational similarity metrics between subnetworks to ensure that backdoor activation does not occur across non-target subnetworks. This forces defenders to confront a dramatically expanded search space for backdoor detection. We show that across two SOTA SuperNets, trained on the CIFAR10 and GTSRB datasets, VNET can achieve attack success rates comparable to traditional poisoning approaches (approximately 99%), while significantly lowering the chances of attack detection, thereby stealthily hiding the attack. Consequently, defenders face increased computational burdens, requiring on average 66 (and up to 250 for highly targeted attacks) sampled subnetworks to detect the attack, implying a roughly 66-fold increase in compute cost required to test the SuperNet for backdoors.