Accelerating CNN inference on long vector architectures via co-design (2212.11574v1)

Published 22 Dec 2022 in cs.DC

Abstract: CPU-based inference can be an alternative to off-chip accelerators, and vector architectures are a promising option due to their efficiency. However, the large design space of convolutional algorithms and hardware implementations makes it challenging to select the best options. This paper presents ongoing research into co-designing vector architectures for CPU-based CNN inference, focusing on the im2col+GEMM and Winograd kernels. Using the gem5 simulator, we examine the impact of various hardware microarchitectural features on the RISC-V Vector and ARM-SVE ISAs. We also study the impact of several BLIS-like algorithmic optimizations on im2col+GEMM. Our co-design study shows that longer vector lengths and larger caches can improve performance by 5x with our optimized CNN kernels, compared to a 512-bit vector length and 1 MB of L2 cache. For Winograd, we present a novel inter-tile parallelization approach that exploits longer vector lengths and offers high memory reuse, resulting in up to 2.4x performance improvement for non-strided convolutional layers with a 3x3 kernel size. Our study also shows that Winograd requires smaller cache sizes than im2col+GEMM.
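To make the im2col+GEMM formulation named in the abstract concrete, the following is a minimal scalar sketch in plain Python. It is not the authors' kernel: it assumes a single input channel, no padding, and a naive unblocked matrix multiply, and the function names (`im2col`, `gemm`, `conv2d_im2col`, `conv2d_direct`) are chosen here for illustration only. The point is the data layout: each receptive field becomes one column of a patch matrix, so the convolution reduces to one matrix product whose innermost loop runs over contiguous output positions, which is the dimension a long vector unit could cover.

```python
def im2col(x, kh, kw, stride=1):
    """Unfold a single-channel 2D input (list of rows) into a patch matrix.

    Each column holds one kh*kw receptive field, so the result has
    kh*kw rows and out_h*out_w columns.
    """
    h, w = len(x), len(x[0])
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    cols = [[0.0] * (out_h * out_w) for _ in range(kh * kw)]
    for oy in range(out_h):
        for ox in range(out_w):
            col = oy * out_w + ox
            for ky in range(kh):
                for kx in range(kw):
                    cols[ky * kw + kx][col] = x[oy * stride + ky][ox * stride + kx]
    return cols, out_h, out_w


def gemm(a, b):
    """Naive matrix multiply C = A @ B on lists of rows (no cache blocking)."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            a_ip = a[i][p]
            # Innermost loop walks one contiguous row of B and C; this is the
            # loop a vector unit would process in vector-length-sized chunks.
            for j in range(m):
                c[i][j] += a_ip * b[p][j]
    return c


def conv2d_im2col(x, filters, stride=1):
    """Convolve x with a list of kh x kw filters via im2col followed by GEMM."""
    kh, kw = len(filters[0]), len(filters[0][0])
    cols, out_h, out_w = im2col(x, kh, kw, stride)
    # Flatten each filter into one row of the weight matrix.
    weights = [[v for row in f for v in row] for f in filters]
    flat = gemm(weights, cols)
    # Reshape each flat output row back to out_h x out_w.
    return [[r[i * out_w:(i + 1) * out_w] for i in range(out_h)] for r in flat]


def conv2d_direct(x, f, stride=1):
    """Reference direct convolution (cross-correlation, as CNN frameworks use)."""
    kh, kw = len(f), len(f[0])
    out_h = (len(x) - kh) // stride + 1
    out_w = (len(x[0]) - kw) // stride + 1
    return [[float(sum(f[ky][kx] * x[oy * stride + ky][ox * stride + kx]
                       for ky in range(kh) for kx in range(kw)))
             for ox in range(out_w)] for oy in range(out_h)]


if __name__ == "__main__":
    x = [[float(r * 4 + c + 1) for c in range(4)] for r in range(4)]  # values 1..16
    box = [[1.0] * 3 for _ in range(3)]
    print(conv2d_im2col(x, [box])[0])
```

The patch matrix duplicates overlapping input elements (up to 9 times for a 3x3 kernel at stride 1), which is one plausible reason im2col+GEMM would benefit more from large caches than a tile-based method such as Winograd; the sketch only illustrates that layout and makes no performance claim.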
