ONNXim: A Fast, Cycle-level Multi-core NPU Simulator (2406.08051v1)

Published 12 Jun 2024 in cs.AR and cs.PF

Abstract: As DNNs are adopted across more application domains and demand ever more compute and memory, designing efficient, high-performance NPUs (Neural Processing Units) is becoming more important. However, existing architectural NPU simulators lack support for one or more of the following: high-speed simulation, multi-core modeling, multi-tenant scenarios, detailed DRAM/NoC modeling, and multiple deep learning frameworks. To address these limitations, this work proposes ONNXim, a fast cycle-level simulator for multi-core NPUs in DNN serving systems. For ease of simulation, it takes as input DNN models in the ONNX graph format, which can be generated from various deep learning frameworks. In addition, based on the observation that typical NPU cores process tensor tiles from on-chip scratchpad memory with deterministic compute latency, ONNXim forgoes detailed modeling of the computation while still preserving simulation accuracy. ONNXim also preserves dependencies between compute and tile DMAs. Meanwhile, the DRAM and NoC are modeled at cycle level to properly capture contention among multiple cores, which can execute different DNN models for multi-tenancy. Consequently, ONNXim is significantly faster than existing simulators (e.g., by up to 384x over Accel-sim) and enables various case studies, such as multi-tenant NPUs, that were previously impractical due to slow simulation speed and/or missing functionality. ONNXim is publicly available at https://github.com/PSAL-POSTECH/ONNXim.
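
The tile-level idea in the abstract (deterministic compute latency per tile, with compute waiting on its tile DMA) can be illustrated with a toy model. This is a minimal sketch and not ONNXim's code: the names `Tile` and `simulate` are hypothetical, the DMA latency is assumed to be a fixed number per tile (ONNXim instead derives memory timing from cycle-level DRAM and NoC models), and the scratchpad is assumed large enough that loads can run ahead of compute.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    # Hypothetical fields for illustration; not ONNXim data structures.
    load_cycles: int     # cycles for the DMA to bring the tile on chip (assumed fixed here)
    compute_cycles: int  # deterministic compute latency once the tile is in scratchpad


def simulate(tiles):
    """Return the cycle at which the last tile finishes computing.

    One DMA engine and one compute unit are assumed. Loads are issued
    back to back, and each tile's compute starts only after both its
    own load has completed and the compute unit is free.
    """
    dma_free = 0   # cycle at which the DMA engine becomes idle
    core_free = 0  # cycle at which the compute unit becomes idle
    for tile in tiles:
        load_done = dma_free + tile.load_cycles
        dma_free = load_done
        start = max(load_done, core_free)  # compute depends on its tile DMA
        core_free = start + tile.compute_cycles
    return core_free


if __name__ == "__main__":
    compute_bound = [Tile(load_cycles=10, compute_cycles=20)] * 3
    memory_bound = [Tile(load_cycles=20, compute_cycles=10)] * 3
    print(simulate(compute_bound), simulate(memory_bound))
```

Because the compute latency of each tile is a known constant, the loop advances time by whole tiles instead of stepping through every cycle of the compute unit, which is the kind of shortcut the abstract credits for the speedup.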

Authors (8)

Hyungkyu Ham
Wonhyuk Yang
Yunseon Shin
Okkyun Woo
Guseul Heo
Sangyeop Lee
Jongse Park
Gwangsun Kim

GitHub

GitHub - PSAL-POSTECH/ONNXim: ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference (62 stars)