WWW2025

Learning by Comparing: Boosting Multimodal Affective Computing through Ordinal Learning

Sijie Mai, Ying Zeng, Haifeng Hu

10 citations

Abstract

Previous studies on multimodal affective computing primarily focus on approximating predictions to annotated labels, often neglecting the ordinal nature of affective states. In this paper, we address this issue by exploring ordinal learning, and a Multimodal Ordinal Affective Computing (MOAC) framework is designed to enhance the understanding of the nature of affective concepts. Specifically, we propose coarse-grained label-level ordinal learning that prompts the model to learn to compare in the label space, encouraging higher predictive values for samples annotated with larger labels over those with smaller labels. Moreover, a regularization loss is proposed to prevent the output distributions from deviating significantly from the annotated label distributions. Fine-grained feature-level ordinal learning is then performed via the feature difference operation and the neutral embedding. The former compares samples in the feature space, calculating the difference between features of different samples to generate 'new' features for a more robust training. The latter seeks to reduce the difficulty of prediction by estimating the difference between the target multimodal representations and a neutral reference. We first demonstrate MOAC in multimodal sentiment analysis, which is a regression task that aligns well with the function of ordinal learning. Then we extend MOAC to classification tasks including multimodal humor detection and sarcasm detection to evaluate its generalizability. Experiments suggest that MOAC outperforms state-of-the-art methods.