NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units (1911.06859v1)

Published 15 Nov 2019 in cs.AR, cs.DC, cs.LG, and cs.NE

Abstract: To satisfy the compute and memory demands of deep neural networks, neural processing units (NPUs) are being widely utilized for accelerating deep learning algorithms. Similar to how GPUs have evolved from a slave device into a mainstream processor architecture, it is likely that NPUs will become first-class citizens in this fast-evolving heterogeneous architecture space. This paper makes a case for enabling address translation in NPUs to decouple the virtual and physical memory address space. Through a careful data-driven application characterization study, we root-cause several limitations of prior GPU-centric address translation schemes and propose a memory management unit (MMU) that is tailored for NPUs. Compared to an oracular MMU design point, our proposal incurs only an average 0.06% performance overhead.

Citations (26)
