EMNLP 2025

BANMIME: Misogyny Detection with Metaphor Explanation on Bangla Memes

Md Ayon Mia, Akm Moshiur Rahman Mazumder, Khadiza Sultana Sayma, Md Fahim, Md. Tahmid Hasan Fuad, Muhammad Ibrahim Khan, AKM Mahbubur Rahman

Abstract

Detecting misogyny in multimodal content remains a notable challenge, particularly in culturally conservative and low-resource contexts like Bangladesh. While existing research has explored hate speech and general meme classification, the nuanced identification of misogyny in Bangla memes, which are rich in metaphor, humor, and visual-textual interplay, remains severely underexplored. To address this gap, we introduce BANMIME, the first comprehensive Bangla misogynistic meme dataset, comprising 2,000 culturally grounded samples, each annotated with misogyny labels, humor categories, metaphor localization, and detailed human-written explanations. We benchmark open- and closed-source vision-language models (VLMs) under zero-shot and prompt-based settings and evaluate their capacity for both classification and explanation generation. Furthermore, we systematically explore multiple fine-tuning strategies, including standard, data-augmented, and Chain-of-Thought (CoT) supervision. Our results demonstrate that CoT-based fine-tuning consistently enhances model performance, both in accuracy and in the generation of meaningful explanations. We envision BANMIME as a foundational resource for advancing explainable multimodal moderation systems in low-resource and culturally sensitive settings. The code and dataset are publicly available at https://github.com/Ayon128/BANMIME. Disclaimer: This paper contains content that some readers may find offensive, which cannot be avoided given the nature of the work.