DOI: 10.1145/3030207.3030230
research-article

TARUC: A Topology-Aware Resource Usability and Contention Benchmark

Published: 17 April 2017

Abstract

Computer architects have increased hardware parallelism and power efficiency by integrating massively parallel hardware accelerators (coprocessors) into compute systems. Many modern HPC clusters now consist of multi-CPU nodes along with additional hardware accelerators in the form of graphics processing units (GPUs). Each CPU and GPU is integrated with system memory via communication links (QPI and PCIe) and multi-channel memory controllers. The increasing density of these heterogeneous computing systems has resulted in complex performance phenomena, including non-uniform memory access (NUMA) effects and resource contention, that make application performance hard to predict and tune. This paper presents the Topology-Aware Resource Usability and Contention (TARUC) benchmark. TARUC is a modular, open-source, and highly configurable benchmark for profiling dense heterogeneous systems, providing insight for developers who wish to tune application codes for specific systems. Analysis of TARUC performance profiles from a multi-CPU, multi-GPU system is also presented.

Published In

ICPE '17: Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering
April 2017
450 pages
ISBN: 9781450344043
DOI: 10.1145/3030207
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. gpu computing
  2. memory bandwidth
  3. memory latency
  4. numa
  5. parallel architectures
  6. performance analysis

Qualifiers

  • Research-article

Conference

ICPE '17

Acceptance Rates

ICPE '17 paper acceptance rate: 27 of 83 submissions (33%)
Overall acceptance rate: 252 of 851 submissions (30%)
