AAAI2026

CoRe-Fed: Bridging Collaborative and Representation Fairness via Federated Embedding Distillation

Noorain Mukhtiar, Adnan Mahmood, Quan Z. Sheng

Abstract

With the proliferation of distributed data sources, Federated Learning (FL) has emerged as a key approach to enable collaborative intelligence through decentralized model training while preserving data privacy. However, conventional FL algorithms often suffer from performance disparities across clients caused by heterogeneous data distributions and unequal participation, which leads to unfair outcomes. Specifically, we focus on two core fairness challenges, i.e., representation bias, arising from misaligned client representations, and collaborative bias, stemming from inequitable contribution during aggregation, both of which degrade model performance and generalizability. To mitigate these disparities, we propose CoRe-Fed, a unified optimization framework that bridges collaborative and representation fairness via embedding-level regularization and fairness-aware aggregation. Initially, an alignment-driven mechanism promotes semantic consistency between local and global embeddings to reduce representational divergence. Subsequently, a dynamic reward-penalty-based aggregation strategy adjusts each client's weight based on participation history and embedding alignment to ensure contribution-aware aggregation. Extensive experiments across diverse models and datasets demonstrate that CoRe-Fed improves both fairness and model performance over the state-of-the-art baseline algorithms.

Code: https://github.com/Noorain1/CoRe-Fed

Introduction

Federated Learning (FL) has gained widespread adoption as a decentralized learning framework that enables multiple devices, sensors, or edge nodes (collectively referred to as clients or participants) to collaboratively train Machine Learning (ML) models without directly sharing their raw data, thus preserving data privacy (Woisetschläger et al. 2024).
However, despite its growing popularity, traditional FL encounters several challenges, particularly in ensuring fair and unbiased model performance across clients with heterogeneous data distributions and varying participation frequencies. These discrepancies yield models that perform disproportionately well on some clients while neglecting others, a phenomenon referred to as performance bias.
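
To make the notion of performance bias concrete, the following minimal sketch summarizes how unevenly a single global model serves its clients. It is an illustration only and not the evaluation protocol of CoRe-Fed: the function name `performance_disparity`, the choice of standard deviation and best-to-worst gap as disparity measures, and the per-client accuracies are all assumptions introduced here.

```python
from statistics import mean, pstdev


def performance_disparity(client_accuracy):
    """Summarize the spread of per-client accuracy for one global model.

    Assumption: disparity is measured by the population standard deviation
    and by the gap between the best-served and worst-served client.
    """
    values = list(client_accuracy.values())
    return {
        "mean": mean(values),             # average accuracy across clients
        "std": pstdev(values),            # spread of accuracy across clients
        "gap": max(values) - min(values)  # best-served minus worst-served
    }


# Hypothetical per-client test accuracies, invented for illustration only.
client_accuracy = {"client_a": 0.92, "client_b": 0.90, "client_c": 0.58}

summary = performance_disparity(client_accuracy)
print(summary)
```

In this hypothetical case the average accuracy looks acceptable, yet one client is served far worse than the other two. A large standard deviation or gap alongside a reasonable mean is the pattern that the term performance bias describes.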