FedDAR: Federated Domain-Aware Representation Learning

Cross-silo federated learning (FL) has become a promising tool for machine learning applications in healthcare. It allows hospitals and institutions to train models with sufficient data while keeping that data private. To make FL models robust to heterogeneous data across clients, most efforts focus on personalizing models per client; however, the latent relationships between clients' data are ignored. FedDAR targets a special non-i.i.d. FL problem, called domain-mixed FL, where each client's data distribution is assumed to be a mixture of several predefined domains. Recognizing the diversity across domains and the similarity within each domain, FedDAR learns a domain-shared representation and domain-wise personalized prediction heads in a decoupled manner. For a simplified linear regression setting, FedDAR provably enjoys a linear convergence rate. For general settings, extensive empirical studies on both synthetic and real-world medical datasets demonstrate its superiority over prior FL methods.
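The core idea can be illustrated on the paper's simplified linear regression setting. The sketch below is a minimal, hypothetical toy (not the authors' implementation): each client holds samples from a mixture of known domains, a representation matrix B is shared across all clients, each domain has its own prediction head, and the server averages client gradients in a FedSGD-style round. All dimensions, learning rates, and variable names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n_domains, n_clients, n = 5, 2, 2, 4, 50  # toy sizes (assumed)

# Hypothetical ground truth: a shared representation B_true and
# domain-wise heads H_true, matching the linear-regression setting.
B_true = rng.normal(size=(k, d))
H_true = rng.normal(size=(n_domains, k))

# Each client's data is a mixture of domains; domain labels are known.
clients = []
for _ in range(n_clients):
    X = rng.normal(size=(n, d))
    dom = rng.integers(0, n_domains, size=n)      # domain label per sample
    y = np.sum(H_true[dom] * (X @ B_true.T), axis=1)
    clients.append((X, dom, y))

# Model: shared representation B, domain-wise heads H.
B = rng.normal(size=(k, d))
H = 0.1 * rng.normal(size=(n_domains, k))

def mse(B, H):
    """Average squared error across all clients."""
    losses = [np.mean((np.sum(H[dom] * (X @ B.T), axis=1) - y) ** 2)
              for X, dom, y in clients]
    return float(np.mean(losses))

lr, rounds = 0.02, 500
loss0 = mse(B, H)
for _ in range(rounds):
    gB = np.zeros_like(B)
    gH = np.zeros_like(H)
    for X, dom, y in clients:
        Z = X @ B.T                               # local representations
        resid = np.sum(H[dom] * Z, axis=1) - y
        # local gradients w.r.t. the shared B and the domain heads
        gB += 2.0 / n * (resid[:, None] * H[dom]).T @ X
        for dd in range(n_domains):
            m = dom == dd
            if m.any():
                gH[dd] += 2.0 / n * (resid[m, None] * Z[m]).sum(axis=0)
    # Server step: average gradients across clients (FedSGD-style),
    # updating the shared representation and the per-domain heads.
    B -= lr * gB / n_clients
    H -= lr * gH / n_clients

print(loss0, mse(B, H))  # training loss should decrease
```

In a full (non-linear) version, B would be a shared feature extractor and each head a small network selected by the sample's domain label; the decoupling is the same, with only the heads personalized per domain.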

Figure: performance under different numbers of training samples per client (100 clients, 5 domains), showing improvements over the commonly used FL algorithms FedSGD and FedRep.

Related publications:
FedDAR: Federated Domain-Aware Representation Learning, https://arxiv.org/abs/2209.04007