Algorithmic Energy Saving for Parallel Cholesky, LU, and QR Factorizations (1411.2536v2)

Published 10 Nov 2014 in cs.DC

Abstract: The pressing demands of improving energy efficiency for high performance scientific computing have motivated a large body of software-controlled hardware solutions using Dynamic Voltage and Frequency Scaling (DVFS) that strategically switch processors to low-power states, when the peak processor performance is not necessary. Although OS level solutions have demonstrated the effectiveness of saving energy in a black-box fashion, for applications with variable execution characteristics, the optimal energy efficiency can be blundered away due to defective prediction mechanism and untapped load imbalance. In this paper, we propose TX, a library level race-to-halt DVFS scheduling approach that analyzes Task Dependency Set of each task in parallel Cholesky, LU, and QR factorizations to achieve substantial energy savings OS level solutions cannot fulfill. Partially giving up the generality of OS level solutions per requiring library level source modification, TX leverages algorithmic characteristics of the applications to gain greater energy savings. Experimental results on two power-aware clusters indicate that TX can save up to 17.8% more energy than state-of-the-art OS level solutions with negligible 3.5% on average performance loss.
