Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
126 tokens/sec
GPT-4o
47 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

BLISlab: A Sandbox for Optimizing GEMM (1609.00076v1)

Published 1 Sep 2016 in cs.MS

Abstract: Matrix-matrix multiplication is a fundamental operation of great importance to scientific computing and, increasingly, machine learning. It is a simple enough concept to be introduced in a typical high school algebra course yet in practice important enough that its implementation on computers continues to be an active research topic. This note describes a set of exercises that use this operation to illustrate how high performance can be attained on modern CPUs with hierarchical memories (multiple caches). It does so by building on the insights that underly the BLAS-like Library Instantiation Software (BLIS) framework by exposing a simplified "sandbox" that mimics the implementation in BLIS. As such, it also becomes a vehicle for the "crowd sourcing" of the optimization of BLIS. We call this set of exercises BLISlab.

Citations (12)

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com