Papers
Topics
Authors
Recent
2000 character limit reached

Identifying the potential of Near Data Computing for Apache Spark (1707.09323v1)

Published 9 May 2017 in cs.DC

Abstract: While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both, batch and stream data processing. There is also a renewed interest is Near Data Computing (NDC) due to technological advancement in the last decade. However, it is not known if NDC architectures can improve the performance of big data processing frameworks such as Apache Spark. In this position paper, we hypothesize in favour of NDC architecture comprising programmable logic based hybrid 2D integrated processing-in-memory and in-storage processing for Apache Spark, by extensive profiling of Apache Spark based workloads on Ivy Bridge Server.

Summary

We haven't generated a summary for this paper yet.

Slide Deck Streamline Icon: https://streamlinehq.com

Whiteboard

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.