Emergent Mind

Abstract

Recently, Graph Neural Networks (GNNs) have become state-of-the-art algorithms for analyzing non-euclidean graph data. However, to realize efficient GNN training is challenging, especially on large graphs. The reasons are many-folded: 1) GNN training incurs a substantial memory footprint. Full-batch training on large graphs even requires hundreds to thousands of gigabytes of memory. 2) GNN training involves both memory-intensive and computation-intensive operations, challenging current CPU/GPU platforms. 3) The irregularity of graphs can result in severe resource under-utilization and load-imbalance problems. This paper presents a GNNear accelerator to tackle these challenges. GNNear adopts a DIMM-based memory system to provide sufficient memory capacity. To match the heterogeneous nature of GNN training, we offload the memory-intensive Reduce operations to in-DIMM Near-Memory-Engines (NMEs), making full use of the high aggregated local bandwidth. We adopt a Centralized-Acceleration-Engine (CAE) to process the computation-intensive Update operations. We further propose several optimization strategies to deal with the irregularity of input graphs and improve GNNear's performance. Comprehensive evaluations on 16 GNN training tasks demonstrate that GNNear achieves 30.8$\times$/2.5$\times$ geomean speedup and 79.6$\times$/7.3$\times$(geomean) higher energy efficiency compared to Xeon E5-2698-v4 CPU and NVIDIA V100 GPU.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.