
Efficient distributed algorithms for Convolutional Neural Networks (2105.13480v2)

Published 27 May 2021 in cs.DC

Abstract: Several efficient distributed algorithms have been developed for matrix-matrix multiplication: the 3D algorithm, the 2D SUMMA algorithm, and the 2.5D algorithm. Each of these algorithms was independently conceived, and they trade off the memory needed per node against the inter-node data communication volume. The convolutional neural network (CNN) computation may be viewed as a generalization of matrix multiplication combined with neighborhood stencil computations. We develop communication-efficient distributed-memory algorithms for CNNs that are analogous to the 2D/2.5D/3D algorithms for matrix-matrix multiplication.
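The abstract's view of a CNN layer as matrix multiplication combined with a neighborhood stencil can be illustrated with a small single-node sketch. This is not the paper's distributed algorithm; it only shows, under assumptions chosen here (stride 1, no padding, nested Python lists, and the hypothetical function names `matmul`, `conv2d_direct`, and `conv2d_as_matmuls`), that a convolution can be written as one channel-contracting matrix product per kernel offset, with the offsets playing the role of the stencil.

```python
def matmul(a, b):
    """Plain (K x C) times (C x N) product on nested lists."""
    return [[sum(a[k][c] * b[c][n] for c in range(len(b)))
             for n in range(len(b[0]))] for k in range(len(a))]


def conv2d_direct(x, w):
    """Reference convolution. x is [C][H][W], w is [K][C][R][S]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K, R, S = len(w), len(w[0][0]), len(w[0][0][0])
    Ho, Wo = H - R + 1, W - S + 1  # stride 1, no padding (assumption)
    return [[[sum(w[k][c][r][s] * x[c][i + r][j + s]
                  for c in range(C) for r in range(R) for s in range(S))
              for j in range(Wo)] for i in range(Ho)] for k in range(K)]


def conv2d_as_matmuls(x, w):
    """Same result, written as a sum of matrix products over stencil offsets."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K, R, S = len(w), len(w[0][0]), len(w[0][0][0])
    Ho, Wo = H - R + 1, W - S + 1
    out = [[0] * (Ho * Wo) for _ in range(K)]
    for r in range(R):            # stencil offset along the rows
        for s in range(S):        # stencil offset along the columns
            # K x C slice of the kernel at this offset
            w_rs = [[w[k][c][r][s] for c in range(C)] for k in range(K)]
            # C x (Ho*Wo) shifted view of the input, flattened row-major
            patch = [[x[c][i + r][j + s] for i in range(Ho) for j in range(Wo)]
                     for c in range(C)]
            prod = matmul(w_rs, patch)  # the matrix-multiplication part
            for k in range(K):
                for n in range(Ho * Wo):
                    out[k][n] += prod[k][n]
    # Unflatten back to [K][Ho][Wo]
    return [[out[k][i * Wo:(i + 1) * Wo] for i in range(Ho)] for k in range(K)]
```

With a 1x1 kernel the loop over offsets runs once and the computation reduces to a single matrix product, which is the sense in which the convolution generalizes matrix multiplication in this sketch.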

Citations (6)