NeurIPS2020

Most ReLU Networks Suffer from 2\ell2 Adversarial Perturbations

Amit Daniely, Hadas Shacham

被引用 17 次

摘要

We consider ReLU networks with random weights, in which the dimension decreases at each layer. We show that for most such networks, most examples xx admit an adversarial perturbation at an Euclidean distance of O(xd)O\left(\frac{\|x\|}{\sqrt{d}}\right), where dd is the input dimension. Moreover, this perturbation can be found via gradient flow, as well as gradient descent with sufficiently small steps. This result can be seen as an explanation to the abundance of adversarial examples, and to the fact that they are found via gradient descent.