GraphScale: Scalable Bandwidth-Efficient Graph Processing on FPGAs

Published 16 Jun 2022 in cs.AR and cs.DB | (2206.08432v1)

Abstract: Recent advances in graph processing on FPGAs promise to alleviate performance bottlenecks with irregular memory access patterns. Such bottlenecks challenge performance for a growing number of important application areas like machine learning and data analytics. While FPGAs denote a promising solution through flexible memory hierarchies and massive parallelism, we argue that current graph processing accelerators either use the off-chip memory bandwidth inefficiently or do not scale well across memory channels. In this work, we propose GraphScale, a scalable graph processing framework for FPGAs. For the first time, GraphScale combines multi-channel memory with asynchronous graph processing (i.e., for fast convergence on results) and a compressed graph representation (i.e., for efficient usage of memory bandwidth and reduced memory footprint). GraphScale solves common graph problems like breadth-first search, PageRank, and weakly-connected components through modular user-defined functions, a novel two-dimensional partitioning scheme, and a high-performance two-level crossbar design.