AAAI2023

Tackling Data Heterogeneity in Federated Learning with Class Prototypes

Yutong Dai, Zeyuan Chen, Junnan Li, Shelby Heinecke, Lichao Sun, Ran Xu

Abstract

Introduction

Federated learning (FL) [1] is an emerging area that attracts significant interest in the machine learning community because it enables collaborative learning from decentralized data with privacy protection. However, in FL, clients may have different data distributions, which violates the standard independent and identically distributed (i.i.d.) assumption of centralized machine learning. This non-i.i.d. phenomenon is known as the data heterogeneity issue and is an acknowledged cause of performance degradation of the global model [2]. Moreover, from a client's perspective, the global model may not be the best for its task. Therefore, personalized federated learning (PFL) emerged as a variant of FL, where personalized models are learned from a combination of the global model and local data to best suit client tasks.

While PFL methods address data heterogeneity, class imbalance combined with data heterogeneity remains overlooked. Class imbalance occurs when clients' data follow different class distributions, and a client may possess no samples of a particular class at all. Ideally, the personalized model performs equally well on all classes that appear in the local training dataset. For example, medical institutions have different distributions of medical records across diseases [3], and it is crucial that the personalized model can detect local diseases with equal precision.

Meanwhile, the currently adopted practice for evaluating PFL methods can also be biased. Specifically, when evaluating accuracy, a single balanced testing dataset is split into multiple local testing datasets that match the clients' training data distributions. Each personalized model is then tested on its local testing dataset, and the averaged accuracy is reported. However, in the presence of class imbalance, such an evaluation protocol will likely give a biased assessment due to potential overfitting to the dominant classes.
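To make the evaluation concern concrete, the following minimal sketch compares plain accuracy with the mean of per-class recalls on one hypothetical client. The client, its 95/5 class split, and the always-predict-the-dominant-class model are all assumptions made for illustration. The class-averaged score is a standard balanced-accuracy measure and is not necessarily the metric proposed in this paper.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of test samples predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """Recall of each class that appears in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical client: the local test split mirrors an imbalanced training
# split, with 95 samples of class 0 and 5 samples of class 1.
y_true = [0] * 95 + [1] * 5
# Hypothetical personalized model that overfits class 0 and always predicts it.
y_pred = [0] * 100

acc = overall_accuracy(y_true, y_pred)
recalls = per_class_accuracy(y_true, y_pred)
balanced = sum(recalls.values()) / len(recalls)

print(f"overall accuracy: {acc:.2f}")        # dominated by the majority class
print(f"per-class recall: {recalls}")        # the minority class is never detected
print(f"class-averaged score: {balanced:.2f}")
```

Under these assumptions the plain accuracy looks strong even though the minority class is never predicted, which is the kind of bias the protocol described above can hide.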
It is tempting to borrow techniques developed for centralized class-imbalanced learning, such as re-sampling or re-weighting the minority classes. However, due to data heterogeneity in the FL setting, different clients might have different dominant classes and even different missing classes; hence direct adoption may not be applicable. Furthermore, re-sampling would require knowledge of all classes, potentially violating privacy constraints. Recent works on class-imbalanced learning in non-FL settings [4, 5] suggest decoupling the training procedure into a representation learning phase and a classification phase. The representation learning phase aims to build high-quality representations for classification, while the classification phase seeks to balance the decision boundaries between dominant and minority classes. Interestingly, FL works such as [6, 7] find that the classifier is the cause of the performance drop and suggest that learning strong shared representations can boost performance. Consistent with these findings, as later shown in Figure 1, we observe that representations of different classes are uniformly distributed over the representation space and cluster around their class prototypes when learned from class-balanced datasets. However, when the training set is class-imbalanced, as is the case for different clients, representations of minority classes overlap with those of majority classes; hence, the representations are of low quality. Motivated by these observations, we propose FedNH (non-parametric head), a novel method that imposes uniformity on the representation space and preserves class semantics to address data heterogeneity with imbalanced classes. We initially distribute class prototypes uniformly in the latent space as an inductive bias to improve the quality of the learned representations, and we smoothly infuse class semantics into the class prototypes to improve the performance of classifiers on local tasks.
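The two ideas in the last sentence, uniformly placed prototypes and a smooth update toward class semantics, can be illustrated with a toy sketch. Everything here is an assumption made for illustration, since this section does not specify FedNH's actual initialization or update rule: the latent space is two-dimensional, prototypes sit at equal angles on the unit circle, classification uses cosine similarity, and the smooth update is an exponential moving average with a hypothetical coefficient `rho`.

```python
import math


def normalize(v):
    """Scale a nonzero 2-D vector to unit length."""
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)


def uniform_prototypes_2d(num_classes):
    """Place one unit-norm prototype per class at equal angles on the unit circle."""
    step = 2 * math.pi / num_classes
    return [(math.cos(k * step), math.sin(k * step)) for k in range(num_classes)]


def predict(prototypes, z):
    """Return the index of the prototype with the highest cosine similarity to z."""
    z = normalize(z)
    scores = [p[0] * z[0] + p[1] * z[1] for p in prototypes]
    return max(range(len(scores)), key=scores.__getitem__)


def smooth_update(prototype, class_mean, rho=0.9):
    """Blend a prototype with a (normalized) class mean, then renormalize.

    rho close to 1 keeps the prototype near its previous position, so class
    semantics are infused gradually rather than overwriting the initialization.
    """
    m = normalize(class_mean)
    mixed = (rho * prototype[0] + (1 - rho) * m[0],
             rho * prototype[1] + (1 - rho) * m[1])
    return normalize(mixed)


# Four classes: prototypes land near (1, 0), (0, 1), (-1, 0), (0, -1).
protos = uniform_prototypes_2d(4)
label = predict(protos, (0.2, 3.0))            # closest in angle to prototype 1
updated = smooth_update(protos[1], (1.0, 1.0))  # nudged toward the class mean
print(label, updated)
```

Because the head has no trainable weights in this sketch, a class with few or no local samples still keeps a well-separated prototype; whether FedNH realizes this in the same way is not stated in this section.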
Our contributions are summarized as follows.
• We propose FedNH, a novel method that tackles data heterogeneity with class imbalance by utilizing uniformity and semantics of class prototypes.
• We design a new metric to evaluate personalized model performance. This metric is less sensitive to class imbalance and reflects personalized model generalization ability on minority classes.