EMNLP 2024

Mitigating Language Bias of LMMs in Social Intelligence Understanding with Virtual Counterfactual Calibration

Peng Chen, Xiao-Yu Guo, Yuan-Fang Li, Xiaowang Zhang, Zhiyong Feng

Abstract

Social intelligence is essential for understanding complex human expressions and social interactions. While large multimodal models (LMMs) have demonstrated remarkable performance in social intelligence question answering (SIQA), they still tend to generate responses that rely on language priors and ignore the relevant context, owing to the predominance of text-based data in the pretraining stage. To interpret this language bias of LMMs, we employ a structural causal model and posit that counterfactual reasoning can mitigate the bias by avoiding spurious correlations between LMMs' internal commonsense knowledge and the given context. However, constructing multimodal counterfactual samples is costly and challenging. To address these challenges, we propose DCVC, an output Distribution Calibration network with Virtual Counterfactual data augmentation. DCVC devises a novel output distribution calibration network to mitigate the impact of negative language biases while preserving beneficial priors. Perturbations are introduced to the output distributions of LMMs to simulate the distribution shifts caused by counterfactual manipulations of the context, and the perturbed distributions are used to construct counterfactual augmented data virtually. Experiments on multiple datasets demonstrate the effectiveness and generalizability of our proposed method.
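
To make the idea of perturbing output distributions concrete, the following is a minimal sketch and not the DCVC implementation. It assumes the LMM exposes a probability distribution over answer options, and it assumes Gaussian noise in log-probability space as the perturbation; the abstract does not specify either. The names `softmax`, `perturb_distribution`, and `scale` are hypothetical and chosen for illustration only.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def perturb_distribution(probs, scale=0.5, rng=None):
    """Return a perturbed copy of `probs` (hypothetical helper).

    Assumption: Gaussian noise is added in log-probability space and the
    result is renormalized. This stands in for the distribution shift that
    a counterfactual change to the context might cause.
    """
    rng = rng or random.Random()
    noisy_logits = [math.log(p + 1e-12) + rng.gauss(0.0, scale) for p in probs]
    return softmax(noisy_logits)


# Made-up logits over four answer options, standing in for an LMM output.
rng = random.Random(0)
original = softmax([2.0, 0.5, 0.1, -1.0])

# Three "virtual counterfactual" distributions derived from the original.
virtual = [perturb_distribution(original, scale=0.5, rng=rng) for _ in range(3)]
```

In this sketch, the perturbed distributions would serve as augmented training inputs for a separate calibration model, so no counterfactual video, audio, or text needs to be constructed. How the perturbations are actually parameterized, and how the calibration network consumes them, is described in the paper body rather than in the abstract.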