37th Conference on Neural Information Processing Systems (NeurIPS 2023)

Invariant Anomaly Detection under Distribution Shifts: A Causal Perspective

João B. S. Carvalho, Mengtao Zhang, Robin Geyer, Carlos Cotrini, Joachim M. Buhmann

Cited by 13

Abstract

Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples by relying solely on the consistency of the normal training samples. Under a distribution shift, the assumption that training samples and test samples are drawn from the same distribution breaks down. In this work, by leveraging tools from causal inference, we attempt to increase the resilience of anomaly detection models to different kinds of distribution shifts. We begin by elucidating a simple yet necessary statistical property that ensures invariant representations, which is critical for robust AD under both domain and covariate shifts. From this property, we derive a regularization term which, when minimized, leads to partial distribution invariance across environments. Through extensive experimental evaluation on both synthetic and real-world tasks, covering a range of six different AD methods, we demonstrate significant improvements in out-of-distribution performance. Under both covariate and domain shift, models regularized with our proposed term show markedly increased robustness. Code is available at https://github.com/JoaoCarv/invariant-anomaly-detection.

1 Introduction

Anomaly detectors are the subject of increased interest in fields such as finance (Ahmed et al. [2016], Hilal et al. [2022]), medicine (Schlegl et al. [2019]), and security (Mothukuri et al. [2021], Siddiqui et al. [2019], Hosseinzadeh et al. [2021]). Having been trained on a sample from an unknown distribution, these models are capable of identifying abnormal objects that are unlikely to come from the original distribution (Bishop and Nasrabadi [2006]). Anomaly detection (AD) stands apart from supervised classification in that no anomalies are available during training, which makes it challenging to articulate a model of the class of objects deemed normal. AD as a field boasts a plethora of diverse methodologies (Ruff et al. [2021]).
Current detectors have demonstrated the advantage of approaches based on representation learning (Reiss and Hoshen [2021], Deng and Li [2022]). In this context, an encoder maps objects to representations that capture the most distinctive features of each object. In addition, the encoder strives to map the class of normal objects onto a subset with a more regular shape, so that representations of abnormal samples are easily identifiable by comparison. Central to this second goal is a notable vulnerability of representation learning-based methods: they hinge on the assumption of independent and identically distributed (i.i.d.) training and test data. This implies that normal samples in the test data are expected to follow the same distribution as those in the training data, and are thereby mapped to the same vicinity in the representation space, an assumption that is frequently violated in real-world scenarios (Koh et al. [2021]). Indeed, distribution shifts in the context of AD present a unique challenge because they involve discerning two types of shift affecting the distribution of the normal objects, $p_n$. Anomalies