CVPR 2023

Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection

Shengyang Sun, Xiaojin Gong

Abstract

Increasing scene-awareness is a key challenge in video anomaly detection (VAD). In this work, we propose a hierarchical semantic contrast (HSC) method to learn a scene-aware VAD model from normal videos. We first incorporate foreground object and background scene features with high-level semantics by taking advantage of pre-trained video parsing models. Then, building upon the autoencoder-based reconstruction framework, we introduce both scene-level and object-level contrastive learning to enforce the encoded latent features to be compact within the same semantic class while being separable across different classes. This hierarchical semantic contrast strategy helps to deal with the diversity of normal patterns and also increases the discriminability of the learned features. Moreover, to handle rare normal activities, we design a skeleton-based motion augmentation to increase samples and refine the model further. Extensive experiments on three public datasets and scene-dependent mixture datasets validate the effectiveness of our proposed method.
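To make the contrastive objective concrete, the following is a minimal sketch of a loss that pulls latent features of the same semantic class together and pushes different classes apart. The abstract does not state the exact loss, so this uses a generic supervised-contrastive (InfoNCE-style) form as an assumption; the function name `semantic_contrast_loss`, the `temperature` parameter, and the use of NumPy are all illustrative choices, not the authors' implementation.

```python
import numpy as np


def semantic_contrast_loss(features, labels, temperature=0.1):
    """Generic supervised-contrastive loss over a batch of latent features.

    Assumed form (not taken from the paper): for each sample, other samples
    with the same semantic label are positives, all remaining samples are
    negatives.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    n = len(labels)

    # Positives: same label, excluding the sample itself.
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0  # no same-class pairs in this batch

    # L2-normalise so dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # Numerically stable log-sum-exp over all other samples (self excluded).
    sim_masked = np.where(not_self, sim, -np.inf)
    row_max = sim_masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(
        np.exp(sim_masked - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = sim - log_denom

    # Average log-probability of the positives, per anchor that has any.
    pos_log_prob = (log_prob * positives).sum(axis=1)[has_pos]
    return float(-(pos_log_prob / pos_count[has_pos]).mean())


# Toy latent features: two tight clusters, one per semantic class.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = semantic_contrast_loss(toy_features, [0, 0, 1, 1])
loss_mismatched = semantic_contrast_loss(toy_features, [0, 1, 0, 1])
```

In a hierarchical setting, one plausible use (again an assumption) is to evaluate this loss once with scene labels on scene-level features and once with object-class labels on object-level features, then add both terms to the reconstruction loss. On the toy data, the labelling that matches the clusters yields the lower loss, which is the compact-within-class, separable-across-class behaviour described above.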