Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 134 tok/s
Gemini 2.5 Pro 41 tok/s Pro
GPT-5 Medium 25 tok/s Pro
GPT-5 High 26 tok/s Pro
GPT-4o 58 tok/s Pro
Kimi K2 194 tok/s Pro
GPT OSS 120B 427 tok/s Pro
Claude Sonnet 4.5 37 tok/s Pro
2000 character limit reached

Heterogeneity-aware Gradient Coding for Straggler Tolerance (1901.09339v1)

Published 27 Jan 2019 in cs.DC, cs.IT, and math.IT

Abstract: Gradient descent algorithms are widely used in machine learning. In order to deal with huge volume of data, we consider the implementation of gradient descent algorithms in a distributed computing setting where multiple workers compute the gradient over some partial data and the master node aggregates their results to obtain the gradient over the whole data. However, its performance can be severely affected by straggler workers. Recently, some coding-based approaches are introduced to mitigate the straggler problem, but they are efficient only when the workers are homogeneous, i.e., having the same computation capabilities. In this paper, we consider that the workers are heterogeneous which are common in modern distributed systems. We propose a novel heterogeneity-aware gradient coding scheme which can not only tolerate a predetermined number of stragglers but also fully utilize the computation capabilities of heterogeneous workers. We show that this scheme is optimal when the computation capabilities of workers are estimated accurately. A variant of this scheme is further proposed to improve the performance when the estimations of the computation capabilities are not so accurate. We conduct our schemes for gradient descent based image classification on QingCloud clusters. Evaluation results show that our schemes can reduce the whole computation time by up to $3\times$ compared with a state-of-the-art coding scheme.

Citations (20)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube