U-SWIM: Universal Selective Write-Verify for Computing-in-Memory Neural Accelerators (2401.05357v1)

Published 11 Dec 2023 in cs.AR and cs.LG

Abstract: Architectures that incorporate Computing-in-Memory (CiM) using emerging non-volatile memory (NVM) devices have become strong contenders for deep neural network (DNN) acceleration due to their impressive energy efficiency. Yet, a significant challenge arises when using these emerging devices: they can show substantial variations during the weight-mapping process, which can severely impact DNN accuracy if not mitigated. A widely accepted remedy for imperfect weight mapping is the iterative write-verify approach, which verifies conductance values and adjusts devices as needed. In all existing publications, this procedure is applied to every individual device, resulting in a significant programming time overhead. In our research, we show that only a small fraction of weights need this write-verify treatment for the corresponding devices while DNN accuracy is preserved, yielding a notable programming speedup. Building on this, we introduce USWIM, a novel method based on the second derivative. It leverages a single iteration of forward and backpropagation to pinpoint the weights demanding write-verify. Through extensive tests on diverse DNN designs and datasets, USWIM achieves up to a 10x programming speedup over the traditional exhaustive write-verify method while maintaining a similar accuracy level. Furthermore, compared to our earlier SWIM technique, USWIM shows a 7x speedup when dealing with devices exhibiting non-uniform variations.
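
The abstract's selection step (one forward and backward pass, a second-derivative criterion, and write-verify applied only to the most sensitive weights) can be illustrated with a short sketch. This is a minimal illustration under assumptions, not the paper's implementation: the abstract does not state the exact sensitivity metric, so the squared-gradient-times-weight proxy below (a cheap diagonal-Hessian stand-in in the spirit of Optimal Brain Damage), the verify_fraction parameter, and the function name rank_weights_for_write_verify are placeholders.

    import torch

    def rank_weights_for_write_verify(model, batch, loss_fn, verify_fraction=0.1):
        """Flag the weights most in need of write-verify (sketch, not the paper's exact metric).

        One forward and one backward pass give per-weight gradients; the
        squared gradient times the squared weight serves as a rough
        second-derivative (diagonal Hessian) proxy for how much a device
        variation on that weight would perturb the loss.
        """
        inputs, targets = batch
        model.zero_grad()
        loss = loss_fn(model(inputs), targets)   # single forward pass
        loss.backward()                          # single backward pass

        # Sensitivity proxy per weight: (dL/dw)^2 * w^2 (assumed, not from the paper).
        scores = torch.cat([
            (p.grad.detach() ** 2 * p.detach() ** 2).flatten()
            for p in model.parameters() if p.grad is not None
        ])

        # Keep only the most sensitive fraction for iterative write-verify.
        k = max(1, int(verify_fraction * scores.numel()))
        threshold = torch.topk(scores, k).values.min()

        # Boolean mask per parameter tensor: True = apply write-verify to this device.
        return {
            name: (p.grad.detach() ** 2 * p.detach() ** 2) >= threshold
            for name, p in model.named_parameters() if p.grad is not None
        }

In a programming flow, devices whose mask entry is False would be written once without verification, while the selected fraction goes through the usual iterative write-verify loop; the reported up-to-10x programming speedup comes from keeping that fraction small.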
