CVPR 2021

Reducing Domain Gap by Reducing Style Bias

Hyeonseob Nam, HyunJae Lee, Jongchan Park, Wonjun Yoon, Donggeun Yoo

Abstract

Convolutional Neural Networks (CNNs) often fail to maintain their performance when they confront new test domains, which is known as the problem of domain shift. Recent studies suggest that one of the main causes of this problem is CNNs' strong inductive bias towards image styles (i.e., textures), which are sensitive to domain changes, rather than contents (i.e., shapes). Inspired by this, we propose to reduce the intrinsic style bias of CNNs to close the gap between domains. Our Style-Agnostic Networks (SagNets) disentangle style encodings from class categories to prevent style-biased predictions and focus more on the contents. Extensive experiments show that our method effectively reduces the style bias and makes the model more robust under domain shift. It achieves remarkable performance improvements in a wide range of cross-domain tasks including domain generalization, unsupervised domain adaptation, and semi-supervised domain adaptation on multiple datasets.

* Equal contribution

Code: https://github.com/hyeonseobnam/sagnet

Similarly, people can easily recognize objects in cartoons or paintings even if they have not seen the same style of an image before. Where does such a difference come from? A recent line of studies has revealed that standard CNNs have an inductive bias far different from human vision: while humans tend to recognize objects based on their contents (i.e., shapes) [27], CNNs exhibit a strong bias towards styles (i.e., textures) [1, 14, 21]. This may explain why CNNs are intrinsically more sensitive to domain shift because image styles are more likely to change across domains than the contents. Geirhos et al. [14] supported this hypothesis by showing that CNNs trained with heavy augmentation on styles become more robust against various image distortions. Research on CNN architectures [40, 28] has also demonstrated that adjusting the style information in CNNs helps to address multi-domain tasks.

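To make the distinction between content and style in feature space concrete, the following is a minimal sketch rather than the authors' implementation. It assumes, following a common convention in the style-transfer literature that this section does not itself state, that the style of a feature map is summarized by its channel-wise mean and standard deviation, and that the normalized residual carries the content. The function names `channel_stats` and `randomize_style`, the `eps` constant, and the uniform interpolation weight are illustrative choices.

```python
import numpy as np


def channel_stats(feat, eps=1e-5):
    """Per-sample, per-channel mean and std over the spatial axes of (N, C, H, W) features."""
    mean = feat.mean(axis=(2, 3), keepdims=True)
    std = np.sqrt(feat.var(axis=(2, 3), keepdims=True) + eps)
    return mean, std


def randomize_style(feat, rng, eps=1e-5):
    """Keep each sample's normalized content and re-style it with statistics
    interpolated between its own and those of a randomly chosen batch sample.

    Assumption (not from the source text): style == channel-wise mean/std.
    """
    n = feat.shape[0]
    mean, std = channel_stats(feat, eps)
    content = (feat - mean) / std             # style-normalized "content"
    perm = rng.permutation(n)                 # random style donor for each sample
    alpha = rng.uniform(size=(n, 1, 1, 1))    # per-sample interpolation weight
    new_mean = alpha * mean + (1.0 - alpha) * mean[perm]
    new_std = alpha * std + (1.0 - alpha) * std[perm]
    return content * new_std + new_mean


# Hypothetical usage on random "features" standing in for a CNN activation.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 8, 8)) * 2.0 + 1.0
restyled = randomize_style(features, rng)
```

Under these assumptions, `restyled` has different channel statistics from `features`, but normalizing either tensor by its own statistics yields nearly the same content, which is the sense in which style can be adjusted without touching content.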
In this paper, we experimentally analyze the relation between CNNs' inductive bias and representation gap across domains, and exploit this relation to address domain shift problems. We propose Style-Agnostic Networks (SagNets) which effectively improve CNNs' domain transferability by controlling their inductive bias, without directly reducing domain discrepancy. Our framework consists of separate content-biased and style-biased networks on top of a feature extractor. The content-biased network is encouraged to focus on contents by randomizing styles in a latent space. The style-biased network is led to focus on styles in the opposite way, against which the feature extractor adversarially makes the styles incapable of discriminating class categories. At test time, the prediction is made by the combination of the feature extractor and the content-biased network, where the style bias is substantially reduced. We show that there exists an apparent correlation between CNNs' inductive bias and their ability to handle domain shift: reducing style bias reduces domain discrepancy. Based on this property, SagNets make significant improvements in a wide range of domain shift scenarios including DG, UDA, and SSDA, across several cross-domain benchmarks such as PACS [29], Office-Home [47], and DomainNet [42]. Our method is orthogonal to the majority of existing domain adaptation and generalization techniques that utilize