Papers
Topics
Authors
Recent
2000 character limit reached

Towards Stochastically Optimizing Data Computing Flows (1607.04334v3)

Published 14 Jul 2016 in cs.DC

Abstract: With rapid growth in the amount of unstructured data produced by memory-intensive applications, large scale data analytics has recently attracted increasing interest. Processing, managing and analyzing this huge amount of data poses several challenges in cloud and data center computing domain. Especially, conventional frameworks for distributed data analytics are based on the assumption of homogeneity and non-stochastic distribution of different data-processing nodes. The paper argues the fundamental limiting factors for scaling big data computation. It is shown that as the number of series and parallel computing servers increase, the tail (mean and variance) of the job execution time increase. We will first propose a model to predict the response time of highly distributed processing tasks and then propose a new practical computational algorithm to optimize the response time.

Summary

We haven't generated a summary for this paper yet.

Slide Deck Streamline Icon: https://streamlinehq.com

Whiteboard

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.