U-SWIM: Universal Selective Write-Verify for Computing-in-Memory Neural Accelerators (2401.05357v1)
Abstract: Architectures that incorporate Computing-in-Memory (CiM) with emerging non-volatile memory (NVM) devices have become strong contenders for deep neural network (DNN) acceleration thanks to their impressive energy efficiency. Yet these emerging devices pose a significant challenge: they can exhibit substantial variations during the weight-mapping process, which can severely degrade DNN accuracy if left unmitigated. The widely accepted remedy for imperfect weight mapping is iterative write-verify, which reads back the programmed conductance values and adjusts the devices as needed. In all existing publications, this procedure is applied to every individual device, incurring a significant programming time overhead. In this work, we show that only a small fraction of weights need the write-verify treatment on their corresponding devices for the DNN accuracy to be preserved, enabling a notable programming speedup. Building on this observation, we introduce U-SWIM, a method based on second-derivative information that uses a single pass of forward and backpropagation to pinpoint the weights requiring write-verify. Extensive experiments on diverse DNN architectures and datasets show that U-SWIM delivers up to 10x programming speedup over the conventional exhaustive write-verify approach while maintaining a similar accuracy level. Furthermore, compared with our earlier SWIM technique, U-SWIM achieves a 7x speedup when devices exhibit non-uniform variations.
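To make the selection step concrete, below is a minimal PyTorch-style sketch of how a second-derivative-based saliency computed from one forward and backward pass could be used to pick the small fraction of weights to write-verify. This is an illustration only, not the paper's exact formulation: the squared-gradient proxy for the diagonal second derivative, the per-parameter variation table `sigma_by_param`, and all function names are assumptions.

```python
# Illustrative sketch of selective write-verify (not the exact U-SWIM procedure).
# Assumptions: the diagonal second derivative is approximated by the squared
# gradient, and per-parameter device-variation magnitudes are supplied in
# `sigma_by_param`. Names and the saliency formula are hypothetical.
import torch

def saliency_scores(model, loss_fn, x, y, sigma_by_param):
    """One forward + backward pass; score each weight by an approximate
    second-derivative saliency scaled by its device's variation."""
    model.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    scores = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        hess_diag = p.grad.detach() ** 2             # crude diagonal 2nd-derivative proxy
        sigma = sigma_by_param.get(name, 1.0)         # non-uniform device variation
        scores[name] = 0.5 * hess_diag * sigma ** 2   # expected loss increase per weight
    return scores

def select_for_write_verify(scores, fraction=0.1):
    """Return boolean masks marking the top `fraction` of weights by saliency."""
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int(fraction * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return {name: s >= threshold for name, s in scores.items()}
```

A caller would then run the iterative write-verify loop only on the devices whose mask entries are True, programming the remaining devices in a single shot.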