- The paper presents a comprehensive survey detailing the architecture and innovations of federated learning systems.
- It analyzes the parallelism strategies, aggregation algorithms, and data security techniques that underpin collaborative model training.
- The study outlines future research directions to improve benchmarks, interpretability, and robustness in distributed learning.
Comprehensive Survey on Distributed Machine Learning and Federated Learning
Introduction to Federated Learning
This paper presents a detailed examination of federated learning (FL), a distributed machine learning approach that leverages decentralized data and computing resources to train models. FL emerged in response to legal and privacy constraints, such as those imposed by the GDPR and other regional data protection regulations, that rule out traditional centralized data aggregation. Its fundamental innovation is the ability to train models collaboratively across multiple entities without pooling raw data on a central server, thereby preserving data privacy and security.
Functional Architecture and Concepts
The authors propose a multi-layer architectural framework for FL systems, comprising presentation, user services, FL training, and infrastructure layers. This framework serves as a conceptual baseline for understanding and analyzing FL systems. They also break down the parallelism techniques used in FL (data, model, and pipeline parallelism) and map them to the corresponding FL types: horizontal FL (data parallelism), vertical FL (model parallelism), and hybrid FL (transfer learning).
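To make the horizontal/vertical distinction concrete, here is a minimal NumPy sketch (all names illustrative, not from the paper): horizontally, each client holds different rows (samples) over the same features; vertically, each client holds different columns (features) for the same samples.

```python
import numpy as np

# Toy dataset: 6 samples, 4 features.
X = np.arange(24).reshape(6, 4)

# Horizontal FL (data parallelism): clients share the feature space
# but hold disjoint subsets of the samples.
horizontal_clients = np.array_split(X, 3, axis=0)   # 3 clients, 2 rows each

# Vertical FL (model parallelism): clients share the sample IDs
# but hold disjoint subsets of the features.
vertical_clients = np.array_split(X, 2, axis=1)     # 2 clients, 2 columns each

print([c.shape for c in horizontal_clients])  # [(2, 4), (2, 4), (2, 4)]
print([c.shape for c in vertical_clients])    # [(6, 2), (6, 2)]
```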
Federated Learning Systems
The paper presents a taxonomy of FL-related techniques, including:
- Parallelism Techniques: Describing how data, model, and pipeline parallelism are applied in federated settings to scale and accelerate model training.
- Aggregation Algorithms: Offering a comparative analysis of centralized, hierarchical, and decentralized aggregation algorithms, such as FedAvg, FedBCD, and SCAFFOLD, and discussing their applicability to diverse FL implementations (a FedAvg sketch follows this list).
- Data Security Approaches: Covering strategies to preserve data privacy during federated training, including differential privacy (DP), homomorphic encryption (HE), and defenses against generative adversarial network (GAN) based attacks (a DP clipping-and-noise sketch follows this list).
- Data Transfer and Execution: Discussing data compression strategies such as subsampling and quantization that reduce per-round communication cost (a quantization sketch follows this list), along with the role of RPC frameworks in managing distributed execution and fault tolerance.
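For the aggregation bullet above, a minimal sketch of the FedAvg server-side step: client parameters are averaged with weights proportional to each client's local dataset size, per McMahan et al.'s algorithm. Function and variable names are illustrative.

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """FedAvg server step: average client model parameters,
    weighted by each client's local dataset size."""
    total = sum(client_sizes)
    # Each client's parameters are a list of NumPy arrays (one per layer).
    return [
        sum(n / total * w[layer] for w, n in zip(client_weights, client_sizes))
        for layer in range(len(client_weights[0]))
    ]

# Three clients, each with a tiny two-layer model.
clients = [[np.ones((2, 2)) * i, np.ones(2) * i] for i in (1.0, 2.0, 3.0)]
sizes = [10, 30, 60]  # local sample counts
global_model = fedavg_aggregate(clients, sizes)
print(global_model[0])  # weighted mean: 0.1*1 + 0.3*2 + 0.6*3 = 2.5
```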
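For the data-security bullet, a sketch of the common differential-privacy recipe applied to a client update before it leaves the device: clip the update's L2 norm to bound sensitivity, then add calibrated Gaussian noise. The clip norm and noise multiplier below are arbitrary illustrative values, not recommendations from the paper.

```python
import numpy as np

def dp_sanitize(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip an update's L2 norm, then add Gaussian noise scaled to the
    clip norm -- the standard Gaussian-mechanism recipe for DP in FL."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))  # bound sensitivity
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

update = np.array([3.0, 4.0])  # L2 norm = 5 -> clipped to norm 1
print(dp_sanitize(update, rng=np.random.default_rng(0)))
```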
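And for the data-transfer bullet, a sketch of simple uniform 8-bit quantization of a gradient tensor, the kind of lossy compression used to shrink per-round communication. The scheme and bit width are chosen for illustration only.

```python
import numpy as np

def quantize_8bit(grad):
    """Uniformly map float32 gradients onto 256 integer levels,
    cutting payload size by ~4x versus float32."""
    lo, hi = grad.min(), grad.max()
    scale = (hi - lo) / 255 if hi > lo else 1.0  # avoid div-by-zero on flat tensors
    q = np.round((grad - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo

grad = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, lo, scale = quantize_8bit(grad)
error = np.abs(dequantize(q, lo, scale) - grad).max()
print(q.nbytes, grad.nbytes, error)  # 1000 vs 4000 bytes, small max error
```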
Federated Learning Frameworks
The paper introduces several widely adopted FL frameworks, including PaddleFL, TensorFlow Federated, FATE, and PySyft. These frameworks provide essential tools and interfaces for implementing FL across various domains, and the paper compares their support for different FL types and data security mechanisms.
Discussion of Limitations and Research Directions
The discussion identifies several research directions that remain open within the domain of federated learning:
- Improvement of Benchmarks: The need for diverse benchmark datasets that support vertical and hybrid federated learning scenarios.
- Interpretability Challenges: Addressing the difficulty in understanding the model outcomes and contributions of individual clients in FL environments.
- Decentralized and Robust Aggregation: Exploring alternative communication topologies and robust aggregation strategies that improve efficiency and resist adversarial attacks (see the median-aggregation sketch after this list).
- Federated Learning for Graph Structures: Suggesting further exploration into federated learning frameworks for graph neural networks and multi-party collaboration.
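On the robust-aggregation direction flagged above, one well-known Byzantine-robust alternative to plain averaging is the coordinate-wise median, sketched below. The median is just one of several published defenses; this example is illustrative and not taken from the surveyed paper.

```python
import numpy as np

def median_aggregate(client_updates):
    """Coordinate-wise median across client updates: a single
    malicious client cannot drag any coordinate arbitrarily far,
    unlike with a plain mean."""
    return np.median(np.stack(client_updates), axis=0)

honest = [np.array([1.0, 1.0]), np.array([1.1, 0.9]), np.array([0.9, 1.1])]
attacker = np.array([1e6, -1e6])  # poisoned update
print(np.mean(np.stack(honest + [attacker]), axis=0))  # mean is wrecked
print(median_aggregate(honest + [attacker]))           # median stays near [1, 1]
```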
Conclusion
The paper concludes that federated learning systems have the potential to transform machine learning applications by ensuring privacy and security, and it motivates further work on current limitations. Future research directions include developing tailored benchmarks, enhancing interpretability, expanding decentralized aggregation techniques, and applying FL to graph-based and imbalanced learning tasks.
This report provides experienced researchers with a thorough understanding of the current state of federated learning and the technological advancements required to address existing gaps and maximize the potential of distributed machine learning systems.