
Implementing Performance Portability of High Performance Computing Programs in the New Golden Age of Chip Architecture (2308.13802v1)

Published 26 Aug 2023 in cs.AR

Abstract: Performance portability has been an important goal of high-performance computing for many years. With the failure of Moore's Law, it is no longer feasible to improve computer performance simply by increasing the amount of existing hardware. Innovation in high-performance computer architecture is therefore imperative, and as a result high-performance computers with multiple architectures coexist in production environments. For example, current high-performance computing nodes often use co-accelerators such as general-purpose GPUs and Intel Xeon Phis to accelerate general-purpose processors. With the flourishing of deep learning, dedicated neural network acceleration chips are also emerging. The emergence of co-accelerators with different architectures, and their wide application in high-performance computers, challenges the performance portability of programs across high-performance computers with different architectures. This article summarizes current performance portability technologies from the perspectives of programming models, automatic parallelization of serial code, and automatic conversion of parallel code, among others. At the end, it also summarizes how to use scientific computing function libraries to improve the performance and performance portability of a program. Different application scenarios require different implementation technologies to achieve performance portability. Program developers choose performance portability solutions for their programs; in doing so, they balance programming efficiency against optimization effects under various constraints.
