EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration (2402.18595v2)

Published 25 Feb 2024 in cs.AR, cs.CE, and cs.LG

Abstract: Deep neural networks (DNNs) have achieved great breakthroughs in many fields such as image classification and natural language processing. However, executing DNNs requires massive numbers of multiply-accumulate (MAC) operations in hardware and thus incurs large power consumption. To address this challenge, we propose a novel digital MAC design based on encoding. In this design, the multipliers are replaced by simple logic gates that represent the results in a wide bit representation. The outputs of the new multipliers are added by bit-wise weighted accumulation, and the accumulation results are compatible with existing computing platforms for accelerating neural networks. Since the multiplication function is replaced by a simple logic representation, the critical paths in the resulting circuits become much shorter. Correspondingly, the pipelining stages and intermediate registers used to store partial sums in the MAC array can be reduced, leading to a significantly smaller area as well as better power efficiency. The proposed design has been synthesized and verified with ResNet18-Cifar10, ResNet20-Cifar100, ResNet50-ImageNet, MobileNetV2-Cifar10, MobileNetV2-Cifar100, and EfficientNetB0-ImageNet. The experimental results confirm a reduction of circuit area by up to 48.79% and a reduction of the power consumption of executing DNNs by up to 64.41%, while the accuracy of the neural networks is well maintained. The open-source code of this work is available on GitHub at https://github.com/Bo-Liu-TUM/EncodingNet/.
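
To make the bit-wise weighted accumulation concrete, below is a minimal NumPy sketch. It is an illustrative assumption, not the paper's circuit: the encoder here simply expands each product into its binary digits with per-bit weights 2^j, whereas in EncodingNet the digits are produced by simple logic gates found through an encoding search and the per-bit weights belong to that learned encoding. The width BITS and the helper names encode_product and encoded_mac are hypothetical and used only for this sketch.

import numpy as np

BITS = 16                              # assumed width of the product encoding
BIT_WEIGHTS = 2 ** np.arange(BITS)     # per-bit weights; exact binary here, learned in the paper

def encode_product(w, a, bits=BITS):
    # Placeholder encoder: the exact unsigned product expanded into `bits` binary digits.
    # EncodingNet instead generates these digits with simple logic functions of w and a.
    p = int(w) * int(a)
    return np.array([(p >> j) & 1 for j in range(bits)], dtype=np.int64)

def encoded_mac(weights, activations):
    # Bit-wise weighted accumulation: count the ones in each bit position across all
    # encoded products, then combine the column counts with the per-bit weights.
    codes = np.stack([encode_product(w, a) for w, a in zip(weights, activations)])
    column_sums = codes.sum(axis=0)
    return int((column_sums * BIT_WEIGHTS).sum())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.integers(0, 16, size=8)    # 4-bit unsigned weights (illustrative)
    a = rng.integers(0, 16, size=8)    # 4-bit unsigned activations (illustrative)
    assert encoded_mac(w, a) == int((w * a).sum())   # matches a conventional MAC
    print("encoded MAC result:", encoded_mac(w, a))

With this exact binary encoder the result matches a conventional MAC exactly; the point of the paper is that an approximate, learned encoding keeps each digit a simple logic function, shortening critical paths while preserving network accuracy.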
