USENIX Security 2026
VSG-Safe: Spotting NSFW Video through Cross-Frame Evidence
Yuyang Zhang, Xudong Jiang, Yuxuan Song, Yuxiang Sun, Yihao Huang, Run Wang, Shundi Xiao, Lina Wang
Abstract
Recent advances in text-to-video (T2V) models enable high-fidelity videos that closely follow textual prompts. While this broadens practical applications, it also amplifies serious security and societal concerns: such models can automatically synthesize visual content, including sexual or violent content, that may be inappropriate in certain usage contexts such as public or workplace settings (e.g., Grok can generate sexual videos in its "Spicy" mode). We observe that such content is often distributed across frames, embedded in visual entities, their attributes, and inter-entity relations. In contrast, existing moderation pipelines primarily treat a video as either individual frames or a raw frame sequence, overlooking the fact that critical semantics can manifest through the combination of specific frames. This gap prevents them from reasoning across frames, confines detection to low-level visual cues such as gore or explicit conflict, and causes frequent failures when cross-frame inference is required, as with illegal activities or threats. To address these limitations, we propose using scene graphs as the core intermediate semantic representation. Scene graphs naturally encode entities, their attributes, and inter-entity relations, and they support reasoning over cross-frame content. Building on this insight, we propose VSG-Safe, a novel scene-graph-driven framework for T2V content moderation. Concretely, our approach first extracts cross-frame content from videos to build scene graphs. It then applies a graph-oriented model to these graphs to jointly capture entities, attributes, and inter-entity relations, enabling effective detection. To evaluate its effectiveness, we conduct extensive experiments on both SOTA benchmarks and our self-constructed video datasets. VSG-Safe attains an average F1-score of 97.62%, outperforming seven baselines by 42.32% on average.
Disclaimer: This paper contains visual content that might be offensive to some readers, such as sexual and violent content. Although we censor and mask Not-Safe-for-Work (NSFW) imagery, reader discretion is advised.
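To make the cross-frame idea in the abstract concrete, the following is a minimal illustrative sketch, not the VSG-Safe implementation. It assumes that some upstream extractor (not shown, and hypothetical here) has already produced per-frame (subject, relation, object) triples with consistent entity names. The `SceneGraph` class, its method names, and the example triples are all assumptions introduced for illustration. The sketch shows only why a pattern spread over several frames is invisible to a per-frame check but visible in a merged graph.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Hypothetical per-frame extractor output: (subject, relation, object).
Triple = Tuple[str, str, str]


@dataclass
class SceneGraph:
    """Toy cross-frame scene graph (illustrative, not the paper's model)."""
    # entity name -> attributes observed in any frame
    attributes: Dict[str, Set[str]] = field(
        default_factory=lambda: defaultdict(set))
    # relation triple -> indices of the frames in which it was observed
    relations: Dict[Triple, List[int]] = field(
        default_factory=lambda: defaultdict(list))

    def add_frame(self, frame_idx: int, triples: List[Triple],
                  attrs: Dict[str, Set[str]] = None) -> None:
        """Merge one frame's triples and entity attributes into the graph."""
        for triple in triples:
            self.relations[triple].append(frame_idx)
        for entity, values in (attrs or {}).items():
            self.attributes[entity].update(values)

    def has_all(self, pattern: List[Triple]) -> bool:
        """True if every triple of the pattern appears in some frame."""
        return all(triple in self.relations for triple in pattern)


# Assumed toy input: each frame alone looks unremarkable.
frames: List[List[Triple]] = [
    [("person", "holds", "knife")],
    [("person", "follows", "pedestrian")],
]
pattern: List[Triple] = [
    ("person", "holds", "knife"),
    ("person", "follows", "pedestrian"),
]

# Per-frame check: does any single frame contain the whole pattern?
per_frame_hit = any(all(t in frame for t in pattern) for frame in frames)

# Cross-frame check: merge all frames, then query the merged graph.
graph = SceneGraph()
for idx, triples in enumerate(frames):
    graph.add_frame(idx, triples)
cross_frame_hit = graph.has_all(pattern)

print(per_frame_hit, cross_frame_hit)
```

In this toy setting the per-frame check misses the pattern while the merged graph exposes it. A real system would replace the fixed pattern lookup with a learned graph-oriented model, as the abstract describes.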