WWW2025

MDAM3: A Misinformation Detection and Analysis Framework for Multitype Multimodal Media

Qingzheng Xu, Heming Du, Szymon Lukasik, Tianqing Zhu, Sen Wang, Xin Yu

Cited by 11

Abstract

Misinformation is a significant societal issue with potentially severe consequences. It appears in text, image, audio, and video modalities and spans several categories, including unimodal deception (fact-conflicting, AI-generated, and offensive content) and cross-modal inconsistencies. However, current detection approaches often focus on text and images, overlooking the growing prevalence of misinformation in audio and video content. Moreover, these methods typically address only one or two types of misinformation rather than all categories simultaneously. These detectors are also usually designed to make judgments without providing explanations, which reduces transparency and limits their broader applicability. To address these issues, we propose MDAM3, a Misinformation Detection and Analysis Framework for Multitype Multimodal Media. MDAM3 analyzes each input individually in an internal detection step and examines relationships across modalities to identify inconsistencies. It draws on web resources and integrates Large Vision-Language Models (LVLMs) to deliver accurate detection results along with detailed analysis. To evaluate MDAM3, we curate MDAM3-DB, a specialized multitype multimodal misinformation dataset. We also conduct a user study to explore MDAM3's usability, interpretability, and effectiveness. We hope this research contributes to advancing misinformation detection methodologies and provides valuable insights for developing robust multimodal analysis tools.