Emergent Mind

Abstract

In data-parallel computing frameworks, intermediate parallel data is often produced at various stages which needs to be transferred among servers in the datacenter network (e.g. the shuffle phase in MapReduce). A stage often cannot start or be completed unless all the required data pieces from the preceding stage are received. \emph{Coflow} is a recently proposed networking abstraction to capture such communication patterns. We consider the problem of efficiently scheduling coflows with release dates in a shared datacenter network so as to minimize the total weighted completion time of coflows. Several heuristics have been proposed recently to address this problem, as well as a few polynomial-time approximation algorithms with provable performance guarantees. Our main result in this paper is a polynomial-time deterministic algorithm that improves the prior known results. Specifically, we propose a deterministic algorithm with approximation ratio of $5$, which improves the prior best known ratio of $12$. For the special case when all coflows are released at time zero, our deterministic algorithm obtains approximation ratio of $4$ which improves the prior best known ratio of $8$. The key ingredient of our approach is an improved linear program formulation for sorting the coflows followed by a simple list scheduling policy. Extensive simulation results, using both synthetic and real traffic traces, are presented that verify the performance of our algorithm and show improvement over the prior approaches.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.