AAAI 2023

UCoL: Unsupervised Learning of Discriminative Facial Representations via Uncertainty-Aware Contrast

Hao Wang, Min Li, Yangyang Song, Youjian Zhang, Liying Chi

Abstract

This paper investigates unsupervised representation learning for facial expression analysis. We argue that Unsupervised Facial Expression Representation (UFER) deserves exploration and has the potential to benefit facial expression analysis on several critical problems, e.g. scaling, annotation bias, the gap between discrete annotations and continuous emotion expressions, and model pre-training. Motivated by this, we propose a UFER method with contrastive local warping (ContraWarping), which leverages the insight that an emotional expression is robust to common global transformations (affine transformation, color jitter, etc.) but can be easily changed by random local warping. Therefore, given a facial image, ContraWarping applies global transformations to generate positive samples and local warping to generate negative samples, and sets up a novel contrastive learning framework on these pairs. Our in-depth investigation shows that: 1) the positive pairs from global transformations can be exploited with general self-supervised learning (e.g. BYOL) and already yield informative features, and 2) the negative pairs from local warping explicitly introduce expression-related variation and bring a further substantial improvement. Based on ContraWarping, we demonstrate the benefit of UFER under two facial expression analysis scenarios: facial expression recognition and image retrieval. For example, directly using ContraWarping features for linear probing achieves 79.95% accuracy on RAF-DB, significantly reducing the gap to the fully supervised counterpart (89.18% / 84.81% with/without pre-training).
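
The sample-generation idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the warping procedure, the encoder, or the loss, so `global_transform`, `local_warp`, `encode`, and the triplet-style `contrast_loss` below are all hypothetical stand-ins chosen only to show how a globally transformed view could act as a positive and a locally warped view as a negative.

```python
import numpy as np

def global_transform(img, rng):
    """Hypothetical global augmentation: random horizontal flip plus brightness jitter."""
    out = img[:, ::-1] if rng.random() < 0.5 else img
    return np.clip(out * rng.uniform(0.8, 1.2), 0.0, 1.0)

def local_warp(img, rng, grid=4, strength=3.0):
    """Hypothetical local warping: shift each grid cell by its own random offset.

    Assumes a 2-D grayscale image whose height and width are divisible by `grid`.
    """
    h, w = img.shape
    # One random (dy, dx) offset per cell, expanded to a per-pixel, piecewise-constant field.
    coarse = rng.uniform(-strength, strength, size=(2, grid, grid))
    block = np.ones((h // grid, w // grid))
    dy, dx = np.kron(coarse[0], block), np.kron(coarse[1], block)
    ys, xs = np.mgrid[0:h, 0:w]
    # Sample each output pixel from its displaced source location (nearest neighbour).
    src_y = np.clip(np.rint(ys + dy), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(xs + dx), 0, w - 1).astype(int)
    return img[src_y, src_x]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def contrast_loss(z_anchor, z_pos, z_neg, margin=0.5):
    """Triplet-style stand-in: pull the positive closer than the negative by `margin`."""
    return max(0.0, margin + cosine(z_anchor, z_neg) - cosine(z_anchor, z_pos))

# Usage with a random image and a random linear "encoder" (both placeholders).
rng = np.random.default_rng(0)
img = rng.random((32, 32))
W = rng.standard_normal((16, 32 * 32))

def encode(x):
    return W @ x.ravel()

positive = global_transform(img, rng)   # expression assumed to be preserved
negative = local_warp(img, rng)         # expression assumed to be altered
loss = contrast_loss(encode(img), encode(positive), encode(negative))
```

In a real training setup the linear map would be replaced by a learned network and the loss minimized over many images; the sketch only fixes the roles of the two kinds of augmented views.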