Emergent Mind

Improving Locality in Sparse and Dense Matrix Multiplications

(2407.00243)
Published Jun 28, 2024 in cs.DC

Abstract

Consecutive matrix multiplications are commonly used in graph neural networks and sparse linear solvers. These operations frequently access the same matrices for both reading and writing. While reusing these matrices improves data locality, it presents a challenge due to the irregular dependencies between iterations across the two multiplication operations. Existing fusion methods often introduce excessive synchronization overhead or overlapped computations with limited benefits. This paper proposes tile fusion, a runtime approach that fuses tiles of the two matrix-matrix multiplications, where at least one of the involved matrices is sparse. Tile fusion aims to improve data locality while providing sufficient workload for cores in shared-memory multi-core processors. For a pair of matrix-matrix multiplications, tile fusion outperforms unfused baseline and MKL implementations with a geometric mean speedup of 1.97$\times$ 1.64$\times$, respectively, on multi-core CPUs.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.