CCS2024
Watch Out! Simple Horizontal Class Backdoor Can Trivially Evade Defense
Hua Ma, Shang Wang, Yansong Gao, Zhi Zhang, Huming Qiu, Minhui Xue, Alsharif Abuadbba, Anmin Fu, Surya Nepal, Derek Abbott
7 citations
Abstract
All current backdoor attacks on deep learning (DL) models fall under the category of a vertical class backdoor (VCB)-class-dependent. In VCB attacks, any sample from a class activates the implanted backdoor when the secret trigger is present, regardless of whether it is a sub-type source-class-agnostic backdoor or a source-classspecific backdoor. For example, a trigger of sunglasses can mislead a facial recognition model into administrator prediction when any people (source-class-agnostic) or a specific group of people (source-class-specific) wear sunglasses. Existing defense strategies overwhelmingly focus on countering VCB attacks, especially those that are source-class-agnostic. This narrow focus neglects the potential threat of other simpler yet general backdoor types, leading to false security implications. It is, therefore, crucial to discover and elucidate unknown backdoor types, particularly those that can be easily implemented, as a mandatory step before developing countermeasures. This study introduces a new, simple, and general type of backdoor attack coined as the horizontal class backdoor (HCB) that trivially breaches the class dependence characteristic of the VCB, bringing a fresh perspective to the community. HCB is now activated when the trigger is presented together with an innocuous feature, regardless of class. For example, the facial recognition model misclassifies a person who wears sunglasses with a smiling innocuous feature into the targeted person, such as an administrator, regardless of which person. Smiling is innocuous because it is irrelevant to the main task of facial recognition. The key is that these innocuous features (such as rain, fog, or snow in autonomous driving or facial expressions like smiling or sadness in facial recognition) are horizontally shared among classes but are