Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
167 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

De-specializing an HLS library for Deep Neural Networks: improvements upon hls4ml (2103.13060v1)

Published 24 Mar 2021 in cs.AR

Abstract: Custom hardware accelerators for Deep Neural Networks are increasingly popular: in fact, the flexibility and performance offered by FPGAs are well-suited to the computational effort and low latency constraints required by many image recognition and natural language processing tasks. The gap between high-level Machine Learning frameworks (e.g., Tensorflow, Pytorch) and low-level hardware design in Verilog/VHDL creates a barrier to widespread adoption of FPGAs, which can be overcome with the help of High-Level Synthesis. hls4ml is a framework that translates Deep Neural Networks into annotated C++ code for High-Level Synthesis, offering a complete and user-friendly design process that has been enthusiastically adopted in physics research. We analyze the strengths and weaknesses of hls4ml, drafting a plan to enhance its core library of components in order to allow more advanced optimizations, target a wider selection of FPGAs, and support larger Neural Network models.

Citations (1)

Summary

We haven't generated a summary for this paper yet.