Abstract

Power and energy consumption are becoming key challenges to successfully deploying the first exascale supercomputer. Large-scale HPC applications waste a significant amount of power during communication- and synchronization-related idle times. However, because of the time scale at which communication happens, transitioning into low-power states during these idle times may introduce unacceptable overhead in application execution time. In this paper, we present COUNTDOWN, a runtime library, supported by a methodology and an analysis tool, for identifying and automatically reducing the power consumption of the computing elements during communication and synchronization. COUNTDOWN saves energy without significantly increasing time-to-completion by lowering CPU power consumption only during idle times for which the power-state transition overhead is negligible. This is done transparently to the user, requiring neither labor-intensive, error-prone modifications to application code nor recompilation of the application. We test our methodology on a production Tier-0 system. For the NAS benchmarks, COUNTDOWN saves between 6% and 50% energy, with a time-to-solution penalty lower than 5%. In a complete production run of Quantum ESPRESSO on 3.5K cores, COUNTDOWN saves 22.36% energy, with a performance penalty below 3%. If the application is executed without communication tuning, the energy saving increases to 37%, with a performance penalty of 6.38%.
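The idea of lowering power only during sufficiently long idle times can be illustrated with a toy energy model. The sketch below is not COUNTDOWN's implementation: the `Phase` type, the `estimate_energy` function, the timeout threshold, and all power figures are hypothetical names and values chosen for illustration, and the model ignores the cost of the power-state transition itself. It assumes that a core stays at high power for the first `timeout_s` seconds of a communication phase and drops to low power only if the phase lasts longer than that.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    kind: str          # "compute" or "comm" (hypothetical labels)
    duration_s: float  # phase length in seconds


def estimate_energy(phases, p_high_w, p_low_w, timeout_s):
    """Return (energy in joules, seconds spent at low power).

    Toy policy (an assumption, not taken from the paper): a communication
    phase runs at high power until timeout_s has elapsed, then at low power
    for the remainder. Phases shorter than the timeout never leave high power.
    """
    energy_j = 0.0
    low_time_s = 0.0
    for phase in phases:
        if phase.kind == "comm" and phase.duration_s > timeout_s:
            high_s = timeout_s
            low_s = phase.duration_s - timeout_s
        else:
            high_s = phase.duration_s
            low_s = 0.0
        energy_j += high_s * p_high_w + low_s * p_low_w
        low_time_s += low_s
    return energy_j, low_time_s


if __name__ == "__main__":
    # Hypothetical trace: one compute phase, one short and one long wait.
    trace = [Phase("compute", 1.0), Phase("comm", 0.0001), Phase("comm", 0.5)]
    energy, low_time = estimate_energy(trace, p_high_w=100.0, p_low_w=40.0,
                                       timeout_s=0.0005)
    baseline = sum(p.duration_s for p in trace) * 100.0
    print(f"energy: {energy:.2f} J, baseline: {baseline:.2f} J, "
          f"low-power time: {low_time:.4f} s")
```

In this model the short wait is left untouched, so it adds no transition risk, while the long wait contributes nearly all of the saving. That is the trade-off the abstract describes, under the simplifying assumptions stated above.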
