Communication-Optimal Parallel Standard and Karatsuba Integer Multiplication in the Distributed Memory Model (2009.14590v1)

Published 30 Sep 2020 in cs.DC and cs.DS

Abstract: We present COPSIM, a parallel implementation of standard integer multiplication for the distributed memory setting, and COPK, a parallel implementation of Karatsuba's fast integer multiplication algorithm for a distributed memory setting. When using $\mathcal{P}$ processors, each equipped with a local non-shared memory, to compute the product of two $n$-digit integers, under mild conditions, our algorithms achieve optimal speedup of the computational time. That is, $\mathcal{O}\left(n^2/\mathcal{P}\right)$ for COPSIM, and $\mathcal{O}\left(n^{\log_2 3}/\mathcal{P}\right)$ for COPK. The total amount of memory required across the processors is $\mathcal{O}\left(n\right)$, that is, within a constant factor of the minimum space required to store the input values. We rigorously analyze the Input/Output (I/O) cost of the proposed algorithms. We show that their bandwidth cost (i.e., the number of memory words sent or received by at least one processor) asymptotically matches the corresponding known I/O lower bounds, and their latency (i.e., the number of messages sent or received in the algorithm's critical execution path) is asymptotically within a multiplicative factor $\mathcal{O}\left(\log^2_2 \mathcal{P}\right)$ of the corresponding known I/O lower bounds. Hence, our algorithms are asymptotically optimal with respect to the bandwidth cost and almost asymptotically optimal with respect to the latency cost.
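For readers unfamiliar with the recursion behind the $\mathcal{O}\left(n^{\log_2 3}\right)$ operation count, the following is a minimal sequential sketch of Karatsuba's algorithm in Python. It is not the paper's COPK and involves no distribution across processors. The function name `karatsuba`, the parameter `cutoff_bits`, and the choice to split operands on bits rather than digits are all assumptions made for illustration.

```python
def karatsuba(x: int, y: int, cutoff_bits: int = 32) -> int:
    """Multiply two non-negative integers with Karatsuba's recursion.

    Illustrative sequential sketch only (assumed names and cutoff);
    it does not model the distributed-memory COPK algorithm.
    """
    if x < 0 or y < 0:
        raise ValueError("operands must be non-negative")

    # Base case: small operands fall back to the built-in product.
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y

    # Split both operands at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive products instead of the four used by the
    # standard method; this is what gives the log_2(3) exponent.
    lo = karatsuba(x_lo, y_lo, cutoff_bits)
    hi = karatsuba(x_hi, y_hi, cutoff_bits)
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff_bits) - lo - hi

    # Recombine: x*y = hi * 2^(2*half) + cross * 2^half + lo.
    return (hi << (2 * half)) + (cross << half) + lo
```

Each call on operands of size $n$ makes three recursive calls on operands of size roughly $n/2$, which gives the recurrence $T(n) = 3\,T(n/2) + \mathcal{O}(n)$ and hence the $n^{\log_2 3}$ term that the abstract divides by $\mathcal{P}$.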

Citations (3)

Authors (1)