
OISA: Architecting an Optical In-Sensor Accelerator for Efficient Visual Computing (2311.18655v1)

Published 30 Nov 2023 in cs.AR and eess.SP

Abstract: Targeting vision applications at the edge, this work systematically explores and proposes, for the first time, a high-performance and energy-efficient Optical In-Sensor Accelerator architecture called OISA. Taking advantage of the promising efficiency of photonic devices, OISA intrinsically implements a coarse-grained convolution operation on the input frames of low-bit-width neural networks in an innovative minimum-conversion fashion. Such a design markedly reduces the power consumed by data conversion, transmission, and processing compared with conventional cloud-centric architectures as well as recently presented edge accelerators. Device-to-architecture simulation results on various image datasets demonstrate acceptable accuracy, while OISA achieves 6.68 TOp/s/W efficiency. On average, OISA reduces power consumption by factors of 7.9 and 18.4 compared with existing electronic in-/near-sensor and ASIC accelerators, respectively.

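As a rough illustration of the operation OISA offloads into the sensor, the sketch below emulates a coarse-grained, low-bit-width convolution in Python: weights are binarized to {-1, +1} and each stride-sized receptive field is reduced to a single accumulated value, mirroring how weighted photocurrents could be summed before a single conversion step. This is a sketch under assumptions, not the paper's implementation; the function names, the binarization scheme, and the 4x4 kernel with stride 4 are illustrative choices.

```python
# Minimal sketch (assumption, not the paper's design) of a coarse-grained,
# low-bit-width convolution of the kind OISA maps into the optical domain.
import numpy as np

def binarize(w):
    """Map real-valued weights to {-1, +1}, a common low-bit-width scheme."""
    return np.where(w >= 0, 1.0, -1.0)

def coarse_conv2d(frame, kernel, stride):
    """Convolve a single-channel frame with a binarized kernel.

    frame  : 2-D array of pixel intensities (stand-in for photocurrents)
    kernel : 2-D real-valued weights, binarized before use
    stride : coarse step between receptive fields
    """
    kb = binarize(kernel)
    kh, kw = kb.shape
    out_h = (frame.shape[0] - kh) // stride + 1
    out_w = (frame.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = frame[i*stride:i*stride+kh, j*stride:j*stride+kw]
            # One accumulated value per patch: the optical analogue would sum
            # the weighted contributions before a single data conversion.
            out[i, j] = np.sum(patch * kb)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((32, 32))          # stand-in for a sensor frame
    kernel = rng.standard_normal((4, 4))  # stand-in for learned weights
    print(coarse_conv2d(frame, kernel, stride=4).shape)  # (8, 8)
```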