- The paper presents FedML, a comprehensive library that standardizes federated learning research by integrating multiple training paradigms.
- It introduces a robust architecture that separates distributed communication from model training, supporting diverse federated configurations.
- Experimental results demonstrate significant efficiency gains from multi-GPU distributed training, reducing wall-clock training time for resource-intensive models.
Overview of FedML: A Research Library and Benchmark for Federated Machine Learning
The paper "FedML: A Research Library and Benchmark for Federated Machine Learning" presents a comprehensive framework aimed at advancing research in federated learning (FL). The authors introduce FedML as an open-source library designed to support a broad range of FL paradigms while addressing existing limitations in the field. This work provides valuable resources for algorithm development and evaluation, focusing on three key paradigms: on-device training, distributed computing, and single-machine simulation.
Challenges in Federated Learning
Federated learning is a distributed paradigm that facilitates model training on decentralized data residing across many devices. This setting introduces several unique challenges, notably statistical heterogeneity, system constraints, and issues of trustworthiness. The authors note that while decentralized optimization methods, communication-efficient techniques, and adversarial defense strategies have made strides against some of these issues, critical gaps remain, particularly in support for diverse FL configurations, standardized algorithm implementations, and benchmarks.
FedML Architectural Contributions
FedML introduces a robust architecture composed of two main components: FedML-API and FedML-core. This design separates distributed communication from model training, supporting various network topologies and enabling flexible message flows beyond typical gradient exchanges. The library also enables realistic system performance evaluation on real-world IoT and mobile platforms through the FedML-Mobile and FedML-IoT subsets, letting researchers measure real system costs and training times across heterogeneous environments. The sketch below illustrates the communication/training separation.
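To make this separation of concerns concrete, here is a minimal sketch; it is not FedML's actual API, and the names Message, ComLink, LocalTrainer, and run_round are hypothetical. A transport layer exchanges tagged messages with arbitrary payloads (so flows beyond plain gradient exchange also fit), while the trainer holds only local optimization logic:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List
import queue

@dataclass
class Message:
    """Generic message: a type tag plus an arbitrary payload, so flows
    beyond gradient exchange (model splits, masks, etc.) also fit."""
    msg_type: str
    sender: int
    payload: Dict[str, Any] = field(default_factory=dict)

class ComLink:
    """Toy in-process transport; a real backend (MPI, gRPC, MQTT)
    could replace this class without touching the training code."""
    def __init__(self, n_nodes: int):
        self.inboxes = [queue.Queue() for _ in range(n_nodes)]

    def send(self, dst: int, msg: Message) -> None:
        self.inboxes[dst].put(msg)

    def recv(self, node: int) -> Message:
        return self.inboxes[node].get()

class LocalTrainer:
    """Pure local training logic: no knowledge of topology or transport."""
    def __init__(self, local_optimum: float):
        self.local_optimum = local_optimum  # stand-in for a client's data

    def train(self, global_w: float) -> float:
        # Placeholder for local SGD: move halfway toward the local optimum.
        return 0.5 * (global_w + self.local_optimum)

def run_round(link: ComLink, trainers: List[LocalTrainer], global_w: float) -> float:
    """One synchronous round: node 0 is the server, nodes 1..K are workers."""
    for i in range(1, len(trainers) + 1):
        link.send(i, Message("MODEL", sender=0, payload={"w": global_w}))
    for i, t in enumerate(trainers, start=1):
        msg = link.recv(i)  # worker i receives the global model
        link.send(0, Message("UPDATE", sender=i,
                             payload={"w": t.train(msg.payload["w"])}))
    updates = [link.recv(0).payload["w"] for _ in trainers]
    return sum(updates) / len(updates)

link = ComLink(n_nodes=4)
workers = [LocalTrainer(1.0), LocalTrainer(2.0), LocalTrainer(3.0)]
print(run_round(link, workers, global_w=0.0))  # -> 1.0
```

Because the trainer never touches the transport, swapping topologies or message types only changes the driver, not the training code, which is the point of the layered design.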
Supported Configurations and Standards
A significant aspect of FedML is its ability to accommodate diverse FL configurations, including vertical FL, split learning, decentralized FL, and hierarchical FL. This flexibility is achieved via a worker/client-oriented programming interface, which supports customization at the level of individual workers while also providing implementations of leading FL algorithms such as FedAvg, FedProx, and FedNAS; the aggregation step of FedAvg is sketched below.
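As a concrete reference point, FedAvg's server-side aggregation is a sample-count-weighted average of client models. The following minimal NumPy sketch is not taken from FedML's codebase; function and variable names are illustrative:

```python
import numpy as np

def fedavg_aggregate(client_params, client_sample_counts):
    """FedAvg aggregation: w_global = sum_k (n_k / n) * w_k,
    where n_k is client k's local sample count and n = sum_k n_k."""
    total = sum(client_sample_counts)
    stacked = np.stack(client_params)                 # shape: (K, D)
    weights = np.array(client_sample_counts) / total  # shape: (K,)
    return weights @ stacked                          # shape: (D,)

# Example: three clients with different amounts of local data.
params = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
counts = [10, 30, 60]
print(fedavg_aggregate(params, counts))  # -> [4. 5.]
```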
FedML's benchmarks, which span linear models, shallow neural networks, and deep neural networks, address the inconsistencies of previous research efforts by standardizing datasets, models, and partition methods (one common non-IID partition scheme is sketched after this paragraph). This standardization enables fair performance evaluations and offers a reliable platform for comparing newly developed algorithms.
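A widely used way such benchmarks generate non-IID client splits is label skew drawn from a Dirichlet distribution, where a concentration parameter alpha controls heterogeneity (smaller alpha means more skew). This is a generic sketch of that scheme with assumed names, not FedML's own partitioning code:

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients with per-class proportions
    drawn from Dirichlet(alpha); small alpha -> highly non-IID clients."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of class c assigned to each client.
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_idx[client].extend(part.tolist())
    return client_idx

# Example: 1000 samples, 10 classes, 5 clients, strong skew (alpha=0.1).
labels = np.random.default_rng(1).integers(0, 10, size=1000)
parts = dirichlet_partition(labels, n_clients=5, alpha=0.1)
print([len(p) for p in parts])
```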
Experimental Insights
The paper presents experimental results showcasing the efficiency gains of FedML's distributed training over standalone simulation, particularly for training modern CNNs. Multi-GPU distributed computing yields a marked reduction in wall-clock training time, underscoring the library's ability to support resource-intensive FL tasks; the sketch below captures the intuition behind the speedup.
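Per round, sequential single-machine simulation pays roughly the sum of per-client training times, while distributed training with clients running in parallel pays roughly the maximum plus synchronization overhead. A back-of-the-envelope sketch with made-up numbers (not measurements from the paper):

```python
# Toy round-time model: sequential simulation vs. parallel distributed training.
# All numbers are illustrative, not figures reported in the paper.
client_times = [12.0, 9.5, 14.2, 11.1]  # seconds of local training per client
comm_overhead = 1.5                     # per-round sync cost, parallel case

sequential_round = sum(client_times)                # 46.8 s: clients run one by one
parallel_round = max(client_times) + comm_overhead  # 15.7 s: slowest client + sync
print(f"sequential: {sequential_round:.1f}s, parallel: {parallel_round:.1f}s")
```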
Implications and Future Directions
The introduction of FedML holds substantial implications for both theoretical exploration and practical work in the FL domain. On the theoretical side, its support for varied computing paradigms invites exploration of optimization strategies and learning models that leverage federated data. On the practical side, by providing a standardized platform it promotes reproducibility and fair performance comparisons. As the FL landscape evolves, FedML's utility is poised to expand through the continuous integration of emerging algorithms and datasets contributed by the research community.
In conclusion, FedML stands as a robust and versatile tool that addresses key limitations in federated learning research, providing a unified framework that supports the diverse and evolving needs of the field. With its commitment to open-source development, it sets a strong foundation for future advancements and collaborations in federated machine learning.