Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices (2310.07217v1)
Abstract: The rapid proliferation of computing domains relying on Internet of Things (IoT) devices has created a pressing need for efficient and accurate deep-learning (DL) models that can run on low-power devices. However, traditional DL models tend to be too complex and computationally intensive for typical IoT end-nodes. To address this challenge, Neural Architecture Search (NAS) has emerged as a popular design automation technique for co-optimizing the accuracy and complexity of deep neural networks. Nevertheless, existing NAS techniques require many iterations to produce a network that adheres to specific hardware constraints, such as the maximum memory available on the hardware or the maximum latency allowed by the target application. In this work, we propose a novel approach to incorporate multiple constraints into so-called Differentiable NAS optimization methods, which generates, in a single shot, a model that respects user-defined constraints on both memory and latency, in a time comparable to a single standard training run. The proposed approach is evaluated on five IoT-relevant benchmarks, including the MLPerf Tiny suite and Tiny ImageNet, demonstrating that, with a single search, it is possible to reduce memory and latency by 87.4% and 54.2%, respectively (as defined by our targets), while ensuring non-inferior accuracy with respect to state-of-the-art hand-tuned deep neural networks for TinyML.
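To make the idea of multiple constraints in a differentiable search objective more concrete, the sketch below shows one common way such an objective can be built: each layer's expected memory and latency are computed as a softmax-weighted average over candidate operations, and a hinge penalty is added only when the expected total exceeds the user-defined target. This is an illustration under assumptions, not the paper's formulation, which the abstract does not state. The function names, the normalized hinge form, the weights `lam_mem` and `lam_lat`, and all cost numbers are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of architecture logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_cost(logits, costs):
    """Expected cost of one layer: candidate costs weighted by softmax probabilities."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, costs))


def constrained_loss(task_loss, layers, mem_target, lat_target,
                     lam_mem=1.0, lam_lat=1.0):
    """Task loss plus hinge penalties on memory and latency (assumed form).

    Each entry of `layers` is a dict with keys "logits", "mem", and "lat",
    holding one value per candidate operation. A penalty is active only when
    the expected cost exceeds its target, so a network that already meets
    both constraints is driven by the task loss alone.
    """
    mem = sum(expected_cost(l["logits"], l["mem"]) for l in layers)
    lat = sum(expected_cost(l["logits"], l["lat"]) for l in layers)
    mem_penalty = max(0.0, mem - mem_target) / mem_target
    lat_penalty = max(0.0, lat - lat_target) / lat_target
    loss = task_loss + lam_mem * mem_penalty + lam_lat * lat_penalty
    return loss, mem, lat


# Toy usage: one layer, two candidate operations with equal logits.
toy_layers = [{"logits": [0.0, 0.0], "mem": [100.0, 20.0], "lat": [10.0, 2.0]}]
loss, mem, lat = constrained_loss(1.0, toy_layers, mem_target=50.0, lat_target=10.0)
```

In this toy case the expected memory exceeds its target while the expected latency does not, so only the memory penalty contributes. In an actual search the logits would be trainable parameters updated by gradient descent alongside the network weights, which requires an automatic-differentiation framework rather than plain Python.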