daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices (1908.05858v1)

Published 16 Aug 2019 in cs.CV, cs.MM, and eess.IV

Abstract: It is always well believed that Binary Neural Networks (BNNs) could drastically accelerate the inference efficiency by replacing the arithmetic operations in float-valued Deep Neural Networks (DNNs) with bit-wise operations. Nevertheless, there has not been open-source implementation in support of this idea on low-end ARM devices (e.g., mobile phones and embedded devices). In this work, we propose daBNN --- a super fast inference framework that implements BNNs on ARM devices. Several speed-up and memory refinement strategies for bit-packing, binarized convolution, and memory layout are uniquely devised to enhance inference efficiency. Compared to the recent open-source BNN inference framework, BMXNet, our daBNN is 7×~23× faster on a single binary convolution, and about 6× faster on Bi-Real Net 18 (a BNN variant of ResNet-18). The daBNN is a BSD-licensed inference framework, and its source code, sample projects and pre-trained models are available on-line: https://github.com/JDAI-CV/dabnn.

Citations (62)

Summary

  • The paper introduces daBNN, a framework that accelerates binary neural network inference on ARM devices by optimizing bit-packing, convolution, and memory layout.
  • The paper demonstrates speedups of up to 23× over an existing open-source framework (BMXNet) by leveraging SIMD instructions and a redesigned binary direct convolution.
  • The framework’s open-source BSD license and efficient design promote practical deployment and further exploration of novel BNN architectures on constrained hardware.

daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM Devices

The paper, titled "daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices," introduces a highly optimized inference framework specifically designed to implement Binary Neural Networks (BNNs) on ARM devices. This research addresses the limitations of executing DNNs on low-end devices, such as mobile phones, due to their constrained memory and computational capabilities. BNNs offer a potential solution by quantizing weights and activations to binary values, thus facilitating efficient inference through bit-wise operations.
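
To make the bit-wise formulation concrete, the sketch below shows the standard XNOR/popcount trick that BNN inference builds on: once weights and activations are constrained to ±1 and packed one bit per element, a dot product reduces to an XOR (or XNOR) followed by a population count. This is a minimal, generic illustration of the idea, not daBNN's actual kernel, and it assumes the vector length is a multiple of 64.

```cpp
#include <bit>       // std::popcount (C++20)
#include <cstddef>
#include <cstdint>

// Dot product of two {-1, +1} vectors packed one bit per element into
// 64-bit words. Any fixed bit convention works as long as weights and
// activations use the same one: XOR marks positions whose signs disagree,
// so dot = n - 2 * (#disagreements).
// Assumes n is a multiple of 64 (i.e. words == n / 64).
int binary_dot(const std::uint64_t* a, const std::uint64_t* b,
               std::size_t words, std::size_t n) {
    std::size_t disagree = 0;
    for (std::size_t i = 0; i < words; ++i) {
        disagree += std::popcount(a[i] ^ b[i]);  // bit-wise "multiply-accumulate"
    }
    return static_cast<int>(n) - 2 * static_cast<int>(disagree);
}
```

A binary convolution is then a sum of such dot products over the receptive field, which is why the quality of the bit-packing and the memory layout (both discussed below) largely determines end-to-end speed.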

Key Contributions

The primary contribution of this research is the development of the daBNN framework, which is substantially faster than existing alternatives. Notable innovations include:

  1. Bit-Packing Optimization: The authors present an enhanced bit-packing scheme that significantly outperforms traditional methods by employing SIMD instructions to aggregate multiple elements at once, reducing latency roughly fourfold compared to naive sequential packing (see the scalar sketch after this list).
  2. Binary Direct Convolution: This method addresses the inefficiencies of the binary matrix multiplication (BGEMM) approach traditionally used in BNNs, whose core is the XNOR/popcount operation sketched above. By reordering the calculation, daBNN reduces the overhead of the extra instructions BGEMM requires, yielding better performance on both ARMv8 and ARMv7 architectures.
  3. Memory Layout Refinement: The authors introduce a new NC₁HWC₂ memory layout that exploits spatial redundancy in convolution operations, decreasing memory accesses by approximately two-thirds compared to conventional layouts (an illustrative index helper follows this list).
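
As referenced in item 1, the following is a minimal scalar sketch of sign-bit packing, assuming standard IEEE-754 floats: each value is binarized by its sign and 64 signs are gathered into one 64-bit word. daBNN's optimized path does the equivalent gathering with ARM NEON SIMD instructions so that many elements are packed per instruction; the scalar loop below only illustrates what is being computed.

```cpp
#include <cstdint>
#include <cstring>

// Pack the sign bits of 64 consecutive floats into one 64-bit word
// (bit set <=> the value is negative, i.e. binarized to -1).
// Scalar reference version; a SIMD implementation gathers many sign
// bits per instruction instead of one at a time.
std::uint64_t pack_signs64(const float* x) {
    std::uint64_t packed = 0;
    for (int i = 0; i < 64; ++i) {
        std::uint32_t bits;
        std::memcpy(&bits, &x[i], sizeof(bits));                 // reinterpret float as raw bits
        packed |= static_cast<std::uint64_t>(bits >> 31) << i;   // keep only the sign bit
    }
    return packed;
}
```

For item 3, the hypothetical index helper below illustrates the general shape of an NC₁HWC₂ layout: the C channels are split into C1 groups of C2, and the C2 sub-block is the innermost dimension, so the channels touched at one spatial position sit contiguously in memory. The formula and parameter names illustrate the layout idea only; they are not daBNN's internal code.

```cpp
#include <cstddef>

// Flat offset of element (n, c, h, w) in an NC1HWC2 layout with C = C1 * C2.
// The innermost C2 block keeps a small run of channels contiguous for each
// spatial position, which improves locality for binary convolution.
std::size_t nc1hwc2_offset(std::size_t n, std::size_t c,
                           std::size_t h, std::size_t w,
                           std::size_t C1, std::size_t C2,
                           std::size_t H, std::size_t W) {
    const std::size_t c1 = c / C2;  // channel group index
    const std::size_t c2 = c % C2;  // position within the group
    return ((((n * C1 + c1) * H + h) * W + w) * C2) + c2;
}
```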

Performance Evaluation

Empirical evaluations show substantial improvements in inference speed: daBNN is 7× to 23× faster than BMXNet on single binary convolutions and roughly 6× faster on Bi-Real Net 18. Compared to TensorFlow Lite, it delivers an 8× to 10× speedup on single binary convolutions and about a 3× improvement on Bi-Real Net 18. These results underline the framework's efficiency in practical deployment scenarios.

Implications and Future Work

The implications of this research are twofold:

  • Practical Deployability: daBNN provides an open-source, BSD-licensed solution for deploying binary networks on ARM devices, encouraging broader adoption and experimentation in industry environments where computational efficiency is crucial.
  • Research Opportunities: The framework's availability facilitates the exploration and design of novel BNN structures, offering insight into more computationally efficient architectures.

Looking forward, the authors express interest in extending daBNN's architecture support to x86 and RISC-V platforms, and they anticipate collaborating with research teams to design and refine new BNN structures.

This research contributes a significant advancement in executing neural networks efficiently on constrained hardware, offering both a practical tool for deployment and a foundation for continued innovation in BNN design. The availability of daBNN as an open-source project further cements its potential influence, encouraging developers and researchers alike to leverage and build upon this framework in advancing neural network applications on low-end devices.
