NeurIPS 2024

Error Correction Output Codes for Robust Neural Networks against Weight-errors: A Neural Tangent Kernel Point of View

Anlan Yu, Shusen Jing, Ning Lyu, Wujie Wen, Zhiyuan Yan

Abstract

Error correcting output code (ECOC) is a classic method that encodes binary classifiers to tackle the multi-class classification problem in decision trees and neural networks. Among ECOCs, the one-hot code has become the default choice in modern deep neural networks (DNNs) because it makes decision making simple. However, it is significantly limited in the robust accuracy it can achieve, particularly in the presence of weight-errors. While recent studies have demonstrated experimentally that non-one-hot ECOCs with multi-bit error correction ability could be a better solution, there is a notable absence of theoretical foundations that relate codeword design, weight-error magnitude, and network characteristics so as to provide robustness guarantees. This work aims to bridge this gap through the lens of the neural tangent kernel (NTK). We have two important theoretical findings: 1) In clean models (without weight-errors), replacing the one-hot code with a non-one-hot ECOC is akin to changing the decoding metric from $\ell_2$ distance to Mahalanobis distance. 2) There exists a threshold, determined by the normalized distance among codewords, the DNN architecture, and the scale of weight-errors. If the distance between a clean output (without weight-errors) and its nearest codeword is smaller than this threshold, then the DNN makes the same predictions as if it were free of weight-errors. Based on these findings, we further demonstrate how to use them in practice to identify optimal ECOCs for simple tasks (small number of classes) and complex tasks (large number of classes) by balancing code orthogonality (as per finding 1) and code distance (as per finding 2). Extensive experimental results across four datasets and