Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 47 tok/s
Gemini 2.5 Pro 44 tok/s Pro
GPT-5 Medium 13 tok/s Pro
GPT-5 High 12 tok/s Pro
GPT-4o 64 tok/s Pro
Kimi K2 160 tok/s Pro
GPT OSS 120B 452 tok/s Pro
Claude Sonnet 4 37 tok/s Pro
2000 character limit reached

On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML (1801.00829v1)

Published 2 Jan 2018 in cs.DB

Abstract: Many large-scale ML systems allow specifying custom ML algorithms by means of linear algebra programs, and then automatically generate efficient execution plans. In this context, optimization opportunities for fused operators---in terms of fused chains of basic operators---are ubiquitous. These opportunities include (1) fewer materialized intermediates, (2) fewer scans of input data, and (3) the exploitation of sparsity across chains of operators. Automatic operator fusion eliminates the need for hand-written fused operators and significantly improves performance for complex or previously unseen chains of operations. However, existing fusion heuristics struggle to find good fusion plans for complex DAGs or hybrid plans of local and distributed operations. In this paper, we introduce an optimization framework for systematically reason about fusion plans that considers materialization points in DAGs, sparsity exploitation, different fusion template types, as well as local and distributed operations. In detail, we contribute algorithms for (1) candidate exploration of valid fusion plans, (2) cost-based candidate selection, and (3) code generation of local and distributed operations over dense, sparse, and compressed data. Our experiments in SystemML show end-to-end performance improvements with optimized fusion plans of up to 21x compared to hand-written fused operators, with negligible optimization and code generation overhead.

Citations (54)

Summary

We haven't generated a summary for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Lightbulb On Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.