WWW2026

Visual Content Moderation in Messaging Systems

Maria Ljubicic, Emanuel Lacic, Denis Helic

Abstract

The widespread use of Multimedia Messaging Service (MMS) has led to a significant increase in the circulation of malicious visual content, presenting new challenges for scalable content moderation systems. In this work, we address the problem of visual spam detection in MMS by introducing a domain-specific taxonomy of inappropriate image categories. Based on this taxonomy, we construct a balanced training dataset from publicly available image collections, as well as two additional evaluation benchmarks derived from real-world MMS messages, which, to the best of our knowledge, are not covered by existing public datasets. All datasets were verified and manually labeled to ensure high annotation quality in line with our taxonomy. Furthermore, we show how to efficiently classify images into eight categories related to MMS spam using an adapted CLIP-based architecture. Our empirical evaluation demonstrates that a fine-tuned CLIP model achieves accuracy that closely matches that of GPT-4o at a significantly lower cost, which is crucial when operating at scale.