EMNLP 2025
BANMIME: Misogyny Detection with Metaphor Explanation on Bangla Memes
Md Ayon Mia, Akm Moshiur Rahman Mazumder, Khadiza Sultana Sayma, Md Fahim, Md. Tahmid Hasan Fuad, Muhammad Ibrahim Khan, AKM Mahbubur Rahman
Abstract
Detecting misogyny in multimodal content remains a notable challenge, particularly in culturally conservative and low-resource contexts like Bangladesh. While existing research has explored hate speech and general meme classification, the nuanced identification of misogyny in Bangla memes, which are rich in metaphor, humor, and visual-textual interplay, remains severely underexplored. To address this gap, we introduce BANMIME, the first comprehensive Bangla misogynistic meme dataset, comprising 2,000 culturally grounded samples; each meme is annotated with misogyny labels, humor categories, metaphor localization, and detailed human-written explanations. We benchmark open- and closed-source vision-language models (VLMs) under zero-shot and prompt-based settings and evaluate their capacity for both classification and explanation generation. Furthermore, we systematically explore multiple fine-tuning strategies, including standard, data-augmented, and Chain-of-Thought (CoT) supervision. Our results demonstrate that CoT-based fine-tuning consistently enhances model performance, both in accuracy and in generating meaningful explanations. We envision BANMIME as a foundational resource for advancing explainable multimodal moderation systems in low-resource and culturally sensitive settings. The code and dataset are publicly available at https://github.com/Ayon128/BANMIME. Disclaimer: This paper contains content that some readers may find offensive, which cannot be avoided due to the nature of the work.