An Edge-Cloud Collaboration Framework for Generative AI Service Provision with Synergetic Big Cloud Model and Small Edge Models (2401.01666v1)
Abstract: Generative artificial intelligence (GenAI) offers various services to users through content creation and is believed to be one of the most important components of future networks. However, training and deploying big artificial intelligence models (BAIMs) introduces substantial computational and communication overhead. This poses a critical challenge to centralized approaches, due to the need for high-performance computing infrastructure and the reliability, secrecy, and timeliness issues of long-distance access to cloud services. Therefore, there is an urgent need to decentralize the services, partly moving them from the cloud to the edge and establishing native GenAI services to enable private, timely, and personalized experiences. In this paper, we propose a brand-new bottom-up BAIM architecture with a synergetic big cloud model and small edge models, and design a distributed training framework and a task-oriented deployment scheme for efficient provision of native GenAI services. The proposed framework can facilitate collaborative intelligence, enhance adaptability, gather edge knowledge, and alleviate the edge-cloud burden. The effectiveness of the proposed framework is demonstrated through an image generation use case. Finally, we outline fundamental research directions to fully exploit the collaborative potential of edge and cloud for native GenAI and BAIM applications.