LLMs as On-demand Customizable Service (2401.16577v1)
Abstract: LLMs have demonstrated remarkable language understanding and generation capabilities. However, training, deploying, and accessing these models pose notable challenges, including high resource demands, long training times, and limited scalability. To address these issues, we introduce the concept of a hierarchical, distributed LLM architecture that aims to enhance the accessibility and deployability of LLMs across heterogeneous computing platforms, including general-purpose computers (e.g., laptops) and IoT-style devices (e.g., embedded systems). By introducing a "layered" approach, the proposed architecture enables on-demand access to LLMs as a customizable service. This approach also supports a suitable trade-off between the available computational resources and the user's application needs. We envision that the hierarchical LLM concept will empower extensive, crowd-sourced user bases to harness the capabilities of LLMs, thereby fostering advancements in AI technology in general.
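The abstract describes the layered, on-demand idea only at a conceptual level. The following minimal sketch illustrates one way the resource-versus-quality trade-off could be resolved; the names `DeviceProfile`, `ModelTier`, and `select_tier`, as well as the tier values, are hypothetical illustrations and are not taken from the paper.

```python
# Minimal sketch of a hierarchical, on-demand model-tier selection.
# Assumption: the service exposes several "layers" of model capability, and a
# request is served by the most capable tier that fits the requesting device.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeviceProfile:
    """Resources available on the requesting device."""
    memory_gb: float      # RAM available for model weights and activations
    supports_gpu: bool    # whether local acceleration is available


@dataclass
class ModelTier:
    """One layer of the hierarchy, from full cloud model down to tiny on-device model."""
    name: str
    min_memory_gb: float  # smallest footprint this tier can run in
    quality_score: float  # rough proxy for task quality (higher is better)


# Hypothetical tiers, ordered from most capable to most lightweight.
TIERS: List[ModelTier] = [
    ModelTier("full-cloud-llm", min_memory_gb=80.0, quality_score=1.00),
    ModelTier("distilled-edge-llm", min_memory_gb=8.0, quality_score=0.80),
    ModelTier("tiny-iot-llm", min_memory_gb=0.5, quality_score=0.55),
]


def select_tier(device: DeviceProfile, min_quality: float) -> Optional[ModelTier]:
    """Pick the most capable tier that fits the device and meets the quality floor.

    Returns None when no tier satisfies both constraints; in that case the
    request would be escalated to a higher level of the hierarchy (e.g., a server).
    """
    for tier in TIERS:  # already sorted from most to least capable
        fits_device = tier.min_memory_gb <= device.memory_gb
        good_enough = tier.quality_score >= min_quality
        if fits_device and good_enough:
            return tier
    return None


if __name__ == "__main__":
    laptop = DeviceProfile(memory_gb=16.0, supports_gpu=False)
    sensor = DeviceProfile(memory_gb=1.0, supports_gpu=False)
    print(select_tier(laptop, min_quality=0.7))  # -> distilled-edge-llm
    print(select_tier(sensor, min_quality=0.7))  # -> None (escalate upstream)
```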