On the Performance and Energy Efficiency of the PGAS Programming Model on Multicore Architectures (1912.12700v1)

Published 29 Dec 2019 in cs.DC, cs.PF, and cs.PL

Abstract: Using large-scale multicore systems to get the maximum performance and energy efficiency with manageable programmability is a major challenge. The partitioned global address space (PGAS) programming model enhances programmability by providing a global address space over large-scale computing systems. However, the performance and energy efficiency of the PGAS model on multicore-based parallel architectures have so far not been investigated thoroughly. In this paper we use a set of selected kernels from the well-known NAS Parallel Benchmarks to evaluate the performance and energy efficiency of the UPC programming language, which is a widely used implementation of the PGAS model. In addition, the MPI and OpenMP versions of the same parallel kernels are used for comparison with their UPC counterparts. The investigated hardware platforms are based on multicore CPUs, both within a single 16-core node and across multiple nodes involving up to 1024 physical cores. On the multi-node platform we used a hardware measurement solution, the High definition Energy Efficiency Monitoring tool, to measure energy. On the single-node system we used a hybrid measurement solution. To help explain the observed performance differences, we use the Intel Performance Counter Monitor to quantify in detail the communication time, cache hit/miss ratio, and memory usage. Our experiments show that UPC is competitive with OpenMP and MPI on single and multiple nodes, with respect to both performance and energy efficiency.

Citations (4)
