CVPR 2025
Just Dance with π! A Poly-modal Inductor for Weakly-supervised Video Anomaly Detection
Snehashis Majhi, Giacomo D'Amicantonio, Antitza Dantcheva, Quan Kong, Lorenzo Garattoni, Gianpiero Francesca, Egor Bondarev, François Brémond
Abstract
Figure 1. (a) Abnormal frames and their multi-modal saliencies in complex real-world scenes. Optical flow captures distinct abnormal motion in "Abuse" and "Arrest", while depth and pose detect subtle movements that optical flow may miss. Panoptic masks and text provide overall scene context. (b) Comparison of multi-modal methods with our PI-VAD. PI-VAD requires the five modalities only during training, which significantly reduces computation at inference and enables real-world applicability.