AAAI2025

pFedES: Generalized Proxy Feature Extractor Sharing for Model Heterogeneous Personalized Federated Learning

Liping Yi, Han Yu, Chao Ren, Gang Wang, Xiaoguang Liu, Xiaoxiao Li

8 citations

Abstract

Recent advances in personalized federated learning have focused on addressing client model heterogeneity. However, most existing methods still require external data, rely on model decoupling, or adopt partial learning strategies, which can limit their practicality and scalability. In this paper, we revisit hypernetwork-based methods and leverage their strong generalization capabilities to design a simple yet effective framework for heterogeneous personalized federated learning. Specifically, we propose MH-pFedHN, which uses a server-side hypernetwork that takes client-specific embedding vectors as input and outputs personalized parameters tailored to each client's heterogeneous model. To promote knowledge sharing and reduce computation, we introduce a multi-head structure within the hypernetwork, allowing clients with similar model sizes to share heads. We further propose MH-pFedHNGD, which integrates an optional lightweight global model to improve generalization. Our framework does not rely on external datasets and does not require disclosure of client model architectures, thereby offering enhanced privacy and flexibility. Extensive experiments on multiple benchmarks and model settings demonstrate that our approach achieves competitive accuracy and strong generalization, and serves as a robust baseline for future research in model-heterogeneous personalized federated learning.

Introduction

Federated learning (FL) has been widely applied in various fields, such as intelligent transportation [1, 2], healthcare [3, 4, 5], and recommendation systems [6, 7, 8]. However, a single global model cannot meet all clients' needs when their data are non-IID. To address this, personalized federated learning (pFL) [9, 10, 11] has emerged. It aims to craft personalized models for clients while enabling knowledge sharing under cross-device settings [12], thus better matching each client's specific tasks and data distribution.
In practice, devices participating in pFL are often heterogeneous: they usually differ in computational resources [13, 14], communication capabilities [15, 16, 17], and model architectures [18, 19, 20], which compounds the challenges that pFL faces under model heterogeneity [21]. To address model-heterogeneous pFL (MH-pFL), researchers have proposed several methods, including partial training [22, 23, 24, 25], federated distillation [26, 27, 28, 29], and model decoupling [30, 31, 32, 33].

Preprint. Under review.
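
To make the multi-head hypernetwork idea from the abstract more concrete, the following is a minimal, untrained sketch in plain Python. It is not the authors' implementation. The class name `MultiHeadHypernetwork`, the bucketing rule (rounding each client's parameter count up to a multiple of `bucket`), the single shared `tanh` layer, and the truncation of the head output are all assumptions made for illustration. The sketch only shows how a server could map a client embedding to a flat parameter vector of the right length while letting clients of similar size share a head, knowing nothing about a client's architecture beyond its parameter count.

```python
import math
import random


class MultiHeadHypernetwork:
    """Toy server-side hypernetwork: client embedding -> flat parameter vector.

    Hypothetical illustration only; weights are random and never trained.
    """

    def __init__(self, client_param_counts, embed_dim=4, hidden_dim=8,
                 bucket=100, seed=0):
        rng = random.Random(seed)
        self.param_counts = dict(client_param_counts)
        # One embedding per client (random here; learned in a real system).
        self.embeddings = {
            cid: [rng.gauss(0, 1) for _ in range(embed_dim)]
            for cid in self.param_counts
        }
        # Shared trunk: every client's embedding passes through these weights.
        self.trunk = [[rng.gauss(0, 0.1) for _ in range(embed_dim)]
                      for _ in range(hidden_dim)]
        # Assumed grouping rule: round each parameter count up to a multiple
        # of `bucket`; clients in the same bucket share one output head.
        self.head_of = {
            cid: math.ceil(n / bucket) * bucket
            for cid, n in self.param_counts.items()
        }
        self.heads = {
            size: [[rng.gauss(0, 0.1) for _ in range(hidden_dim)]
                   for _ in range(size)]
            for size in set(self.head_of.values())
        }

    def generate(self, cid):
        """Return a flat list of parameters sized for client `cid`."""
        e = self.embeddings[cid]
        hidden = [math.tanh(sum(w * x for w, x in zip(row, e)))
                  for row in self.trunk]
        head = self.heads[self.head_of[cid]]
        flat = [sum(w * h for w, h in zip(row, hidden)) for row in head]
        # Truncate the shared head's output to this client's parameter count.
        return flat[: self.param_counts[cid]]


# Three hypothetical clients, identified only by their parameter counts.
hn = MultiHeadHypernetwork({"a": 90, "b": 100, "c": 250})
params = {cid: hn.generate(cid) for cid in ("a", "b", "c")}
```

In this sketch, clients `a` and `b` fall into the same bucket and therefore share one head, while `c` gets its own. Each client would then reshape its flat vector into its own layers locally, so the server never needs the layer layout.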