NeurIPS 2020

Targeted Adversarial Perturbations for Monocular Depth Prediction

Alex Wong, Safa Cicek, Stefano Soatto

55 citations

Abstract

We study the effect of adversarial perturbations on the task of monocular depth prediction. Specifically, we explore the ability of small, imperceptible additive perturbations to selectively alter the perceived geometry of the scene. We show that such perturbations can not only globally re-scale the predicted distances from the camera, but also alter the prediction to match a different target scene. We also show that, when given semantic or instance information, perturbations can fool the network into altering the depth of specific categories or instances in the scene, and even into removing them while preserving the rest of the scene. To understand the effect of targeted perturbations, we conduct experiments on state-of-the-art monocular depth prediction methods. Our experiments reveal vulnerabilities in monocular depth prediction networks, and shed light on the biases and context learned by them.

Figure 1: Altering the predicted scene with adversarial perturbations. Top to bottom: input image; adversarial perturbations with upper norm of 2 × 10⁻²; predicted scene visualized as disparity. Left to right: original image and predicted scene; overall scene altered to be 10% closer; all vehicles altered to be 10% closer; vehicle in the center of the road removed by perturbations.

Recently, supervisory trends have shifted to unsupervised (self-supervised) learning, which relies on stereo pairs or video sequences during training and provides supervision in the form of image reconstruction. While video-based methods recover depth only up to an unknown scale, stereo-based methods can predict depth in metric scale because the pose (baseline) between the cameras is known. To learn depth from stereo pairs, [13] predicted disparity by reconstructing one image from its stereo counterpart. Monodepth [15] predicted both left and right disparities from a single image