
The I/O complexity of hybrid algorithms for square matrix multiplication (1904.12804v1)

Published 29 Apr 2019 in cs.DS

Abstract: Asymptotically tight lower bounds are derived for the I/O complexity of a general class of hybrid algorithms computing the product of $n \times n$ square matrices, combining the "Strassen-like" fast matrix multiplication approach, with computational complexity $\Theta\left(n^{\log_2 7}\right)$, and "standard" matrix multiplication algorithms, with computational complexity $\Omega\left(n^3\right)$. We present a novel and tight $\Omega\left(\left(\frac{n}{\max\{\sqrt{M},n_0\}}\right)^{\log_2 7}\left(\max\left\{1,\frac{n_0}{M}\right\}\right)^3 M\right)$ lower bound for the I/O complexity of a class of "uniform, non-stationary" hybrid algorithms when executed in a two-level storage hierarchy with $M$ words of fast memory, where $n_0$ denotes the threshold size of sub-problems which are computed using standard algorithms with algebraic complexity $\Omega\left(n^3\right)$. The lower bound is actually derived for the more general class of "non-uniform, non-stationary" hybrid algorithms, which allow recursive calls to have a different structure even when they refer to the multiplication of matrices of the same size and in the same recursive level, although the quantitative expressions become more involved. Our results are the first I/O lower bounds for these classes of hybrid algorithms. All presented lower bounds apply even if the recomputation of partial results is allowed, and they are asymptotically tight. The proof technique combines the analysis of Grigoriev's flow of the matrix multiplication function, combinatorial properties of the encoding functions used by fast Strassen-like algorithms, and an application of the Loomis-Whitney geometric theorem for the analysis of standard matrix multiplication algorithms. Extensions of the lower bounds for a parallel model with $P$ processors are also discussed.
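To make the notion of a hybrid algorithm with threshold $n_0$ concrete, the following is a minimal Python sketch, not the paper's construction. It recurses with the classical Strassen scheme and switches to the standard cubic algorithm once the sub-problem size is at most $n_0$. It uses the same Strassen scheme at every level, which is a simplification relative to the non-stationary class described in the abstract. The function names (`hybrid_multiply`, `lower_bound_expression`) and the fallback to the standard algorithm on odd sizes are assumptions made for illustration. `lower_bound_expression` simply evaluates the expression inside the $\Omega(\cdot)$ as it is written in the abstract, with constant factors omitted.

```python
import math


def lower_bound_expression(n, M, n0):
    """Evaluate the expression inside Omega(...) as written in the abstract.

    Constant factors are omitted, so this is only a growth-rate indicator.
    """
    strassen_part = (n / max(math.sqrt(M), n0)) ** math.log2(7)
    standard_part = max(1.0, n0 / M) ** 3
    return strassen_part * standard_part * M


def standard_multiply(A, B):
    """Standard cubic-time product of two n x n matrices (lists of lists)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(X):
    """Split an even-sized square matrix into its four quadrants."""
    h = len(X) // 2
    return ([r[:h] for r in X[:h]], [r[h:] for r in X[:h]],
            [r[:h] for r in X[h:]], [r[h:] for r in X[h:]])


def hybrid_multiply(A, B, n0):
    """Strassen recursion above the threshold n0, standard algorithm below it.

    Assumption for this sketch: odd-sized sub-problems also fall back to the
    standard algorithm instead of being padded.
    """
    n = len(A)
    if n <= n0 or n % 2 == 1:
        return standard_multiply(A, B)

    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    # The seven recursive products of the classical Strassen scheme.
    M1 = hybrid_multiply(add(A11, A22), add(B11, B22), n0)
    M2 = hybrid_multiply(add(A21, A22), B11, n0)
    M3 = hybrid_multiply(A11, sub(B12, B22), n0)
    M4 = hybrid_multiply(A22, sub(B21, B11), n0)
    M5 = hybrid_multiply(add(A11, A12), B22, n0)
    M6 = hybrid_multiply(sub(A21, A11), add(B11, B12), n0)
    M7 = hybrid_multiply(sub(A12, A22), add(B21, B22), n0)

    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)

    # Reassemble the four quadrants into one matrix.
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

Calling `hybrid_multiply(A, B, n0)` with a larger `n0` performs fewer Strassen levels and more standard multiplications, which is the trade-off that the threshold parameter in the bound captures.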

Citations (5)


Authors (1)
