ICML 2020

LTF: A Label Transformation Framework for Correcting Label Shift

Jiaxian Guo, Mingming Gong, Tongliang Liu, Kun Zhang, Dacheng Tao

43 citations

Abstract

Distribution shift is a major obstacle to the deployment of current deep learning models on real-world problems. Let Y be the target (label) and X the predictors (features). We focus on one type of distribution shift, target shift, where the marginal distribution of the target variable P_Y changes, but the conditional distribution P_{X|Y} does not. Existing methods estimate the density ratio between the source- and target-domain label distributions by density matching. However, these methods are either computationally infeasible for large-scale data or restricted to shift correction for discrete labels. In this paper, we propose an end-to-end Label Transformation Framework (LTF) for correcting target shift, which implicitly models the shift of P_Y and the conditional distribution P_{X|Y} using neural networks. Thanks to the flexibility of deep networks, our framework can handle continuous, discrete, and even multidimensional labels in a unified way and is scalable to big data. Moreover, for high-dimensional X, such as images, we find that the redundant information in X severely degrades the estimation accuracy. To remedy this issue, we propose to match the distribution implied by our generative model and the target-domain distribution in a low-dimensional feature space that discards information irrelevant to Y. Both theoretical and empirical studies demonstrate the superiority of our method over previous approaches.
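To make the target-shift setting concrete, the sketch below estimates the label density ratio w(y) = P_T(y) / P_S(y) for discrete labels. It is not LTF. It is a confusion-matrix estimator in the style of black-box shift estimation, swapped in here as a simple illustration of the discrete-label methods the abstract contrasts itself with. The function name `estimate_label_weights`, the classifier f, and all toy numbers are assumptions made for illustration. Because P_{X|Y} is shared across domains, P_T(f(X)=i) = sum_j P_S(f(X)=i, Y=j) w(j), which is a linear system in w.

```python
import numpy as np


def estimate_label_weights(joint_source, pred_target):
    """Estimate w(y) = P_T(y) / P_S(y) for discrete labels (illustrative sketch).

    joint_source[i, j] = P_S(f(X)=i, Y=j), measured on labeled source data.
    pred_target[i]     = P_T(f(X)=i), measured on unlabeled target data.
    Assumes target shift (P_{X|Y} unchanged) and an invertible joint_source.
    """
    C = np.asarray(joint_source, dtype=float)
    mu = np.asarray(pred_target, dtype=float)
    w = np.linalg.solve(C, mu)      # solve C w = mu
    return np.clip(w, 0.0, None)    # density ratios cannot be negative


# Hypothetical toy problem with two classes.
# cond[i, j] = P(f(X)=i | Y=j); identical in both domains under target shift.
cond = np.array([[0.9, 0.2],
                 [0.1, 0.8]])
p_source = np.array([0.5, 0.5])     # source label marginal P_S(Y)
p_target = np.array([0.8, 0.2])     # target label marginal P_T(Y), unknown in practice

joint_source = cond * p_source      # broadcasting scales column j by P_S(Y=j)
pred_target = cond @ p_target       # what the classifier predicts on target data

weights = estimate_label_weights(joint_source, pred_target)
print(weights)
```

In this toy case the recovered weights match the true ratios 0.8/0.5 and 0.2/0.5 up to floating-point error. The estimator needs a finite label set and an invertible confusion matrix, which is the kind of restriction to discrete labels that the abstract attributes to existing methods.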