Approximate Wireless Communication for Lossy Gradient Updates in IoT Federated Learning (2404.11035v1)

Published 17 Apr 2024 in cs.IT, cs.DC, cs.NI, and math.IT

Abstract: Federated learning (FL) has emerged as a distributed machine learning (ML) technique that can protect local data privacy for participating clients and improve system efficiency. Instead of sharing raw data, FL exchanges intermediate learning parameters, such as gradients, among clients. This article presents an efficient wireless communication approach tailored for FL parameter transmission, especially for Internet of Things (IoT) devices, to facilitate model aggregation. Our study considers practical wireless channels that can introduce random bit errors, which can substantially degrade FL performance. Motivated by the empirical distribution of gradient values, we introduce a novel received bit masking method that confines received gradient values within prescribed limits. Moreover, given the intrinsic error resilience of ML gradients, our approach enables the delivery of approximate gradient values with errors, without resorting to extensive error correction coding or retransmission. This strategy reduces computational overhead at both the transmitter and the receiver and minimizes communication latency. Consequently, our scheme is particularly well suited for resource-constrained IoT devices. Additionally, we explore the inherent protection of the most significant bits (MSBs) afforded by Gray coding in high-order modulation. Our simulations demonstrate that the proposed scheme can effectively mitigate the impact of random bit errors on FL performance, achieving similar learning objectives with only 50% of the air time required by existing methods involving error correction and retransmission.
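The core mechanism described in the abstract can be illustrated with a short sketch. Below is a minimal NumPy example of the bit-masking idea under stated assumptions: gradients arrive as raw IEEE-754 float32 words, corrupted values are confined to an illustrative prescribed range of ±0.1, and a NaN/Inf-replacement-plus-clamp rule stands in for the paper's exact bit-level masking, which the abstract does not specify. The function name, bit error rate, and bound are all hypothetical.

```python
import numpy as np

def mask_received_gradients(received_words: np.ndarray, g_max: float = 0.1) -> np.ndarray:
    """Confine bit-error-corrupted gradients to [-g_max, g_max].

    A stand-in for the paper's received bit masking step: since the exact
    bit-level rule is not given in the abstract, we approximate it by
    replacing NaN/Inf values and clamping out-of-range magnitudes.
    """
    grads = received_words.view(np.float32).copy()
    grads = np.nan_to_num(grads, nan=0.0, posinf=g_max, neginf=-g_max)
    return np.clip(grads, -g_max, g_max)

# Usage: corrupt a gradient vector with i.i.d. bit flips at an assumed
# BER of 1e-3, then recover approximate gradients without error
# correction coding or retransmission.
rng = np.random.default_rng(0)
true_grads = rng.normal(0.0, 0.01, size=10_000).astype(np.float32)
words = true_grads.view(np.uint32).copy()
for b in range(32):
    flip = rng.random(words.size) < 1e-3
    words[flip] ^= np.uint32(1) << np.uint32(b)
approx = mask_received_gradients(words)
# Every recovered value now lies in [-g_max, g_max], so a handful of
# corrupted words cannot blow up the aggregated model update.
print(np.max(np.abs(approx - true_grads)))
```

This reflects the trade-off the abstract argues for: because gradients are intrinsically error resilient, bounding the damage of a bit error is cheaper than preventing it, which is what lets the scheme skip FEC decoding and retransmission on resource-constrained IoT devices.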
