research-article

Caliper: performance introspection for HPC software stacks

Authors:

David Beckingsale,

Peer-Timo Bremer,

Alfredo Gimenez,

Matthew LeGendre,

Martin Schulz

SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

Article No.: 47, Pages 1 - 11

Published: 13 November 2016

Abstract

Many performance engineering tasks, from long-term performance monitoring to post-mortem analysis and online tuning, require efficient runtime methods for introspection and performance data collection. To understand interactions between components in increasingly modular HPC software, performance introspection hooks must be integrated into runtime systems, libraries, and application codes across the software stack. This requires an interoperable, cross-stack, general-purpose approach to performance data collection, which neither application-specific performance measurement nor traditional profile or trace analysis tools provide. With Caliper, we have developed a general abstraction layer to provide performance data collection as a service to applications, runtime systems, libraries, and tools. Individual software components connect to Caliper in independent data producer, data consumer, and measurement control roles, which allows them to share performance data across software stack boundaries. We demonstrate Caliper's performance analysis capabilities with two case studies of production scenarios.
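The separation of data producer, data consumer, and measurement control roles described in the abstract can be illustrated with a small toy model. The sketch below is not Caliper's API: the `Channel` class, its methods, the `run_demo` helper, and the attribute names are all hypothetical, and a fake clock replaces real timers so the result is deterministic. It only assumes that producers annotate regions, that a consumer receives the active annotation context together with the time elapsed since the previous event, and that a control flag can switch collection on or off.

```python
from collections import defaultdict


class Channel:
    """Toy shared annotation channel (hypothetical; not Caliper's real API)."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time
        self._context = []       # currently open (attribute, value) annotations
        self._consumers = []     # callbacks that receive snapshots
        self._last = clock()     # time of the previous event
        self.enabled = True      # measurement control: toggles collection

    # Data consumer role: register a callback for snapshots.
    def subscribe(self, consumer):
        self._consumers.append(consumer)

    # Data producer role: any component may open or close annotations.
    def begin(self, attribute, value):
        self._snapshot()
        self._context.append((attribute, value))

    def end(self, attribute):
        self._snapshot()
        # Close the innermost open annotation with this attribute name.
        for i in reversed(range(len(self._context))):
            if self._context[i][0] == attribute:
                del self._context[i]
                break

    def _snapshot(self):
        # Attribute the time since the previous event to the context
        # that was active during that interval.
        now = self._clock()
        elapsed, self._last = now - self._last, now
        if self.enabled:
            for consumer in self._consumers:
                consumer(tuple(self._context), elapsed)


def run_demo():
    ticks = iter(range(0, 100, 5))            # fake clock: 0, 5, 10, ...
    channel = Channel(clock=lambda: next(ticks))

    profile = defaultdict(int)                # consumer: time per context

    def aggregate(context, elapsed):
        profile[context] += elapsed

    channel.subscribe(aggregate)

    channel.begin("phase", "solve")           # annotation from application code
    channel.begin("library", "linear_solver") # annotation from a library
    channel.end("library")
    channel.end("phase")
    return dict(profile)
```

In this model the application and the library never call each other's instrumentation; both only write to the shared channel, and the consumer sees the combined context. That is the cross-stack sharing the abstract refers to, reduced to its simplest form.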

Published In

SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

November 2016

1034 pages

ISBN:9781467388153

Conference Chair:
John West
University of Texas at Austin

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS: Computer Society

In-Cooperation

SIGHPC: ACM Special Interest Group on High Performance Computing

Publisher

IEEE Press

Conference

SC16

SC16: The International Conference for High Performance Computing, Networking, Storage and Analysis

November 13 - 18, 2016

Salt Lake City, Utah

Acceptance Rates

SC '16 Paper Acceptance Rate 81 of 442 submissions, 18%;

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Cited By

Malony A, Ramesh S, Huck K, Chaimov N, Shende S (2019). A Plugin Architecture for the TAU Performance System. Proceedings of the 48th International Conference on Parallel Processing, pp. 1-11. DOI: 10.1145/3337821.3337916. Online publication date: 5-Aug-2019.
https://dl.acm.org/doi/10.1145/3337821.3337916

Wang T, Jain N, Beckingsale D, Boehme D, Mueller F, Gamblin T (2019). FuncyTuner. Proceedings of the 48th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3337821.3337842. Online publication date: 5-Aug-2019.
https://dl.acm.org/doi/10.1145/3337821.3337842

Giménez A, Gamblin T, Bhatele A, Wood C, Shoga K, Marathe A, Bremer P, Hamann B, Schulz M, Mohr B, Raghavan P (2017). ScrubJay. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-12. DOI: 10.1145/3126908.3126935. Online publication date: 12-Nov-2017.
https://dl.acm.org/doi/10.1145/3126908.3126935

Ngai W, Hegeman T, Heldens S, Iosup A (2017). Granula. Proceedings of the Fifth International Workshop on Graph Data-management Experiences & Systems, pp. 1-6. DOI: 10.1145/3078447.3078455. Online publication date: 19-May-2017.
https://dl.acm.org/doi/10.1145/3078447.3078455

Pearce O, Ahmed H, Larsen R, Richards D (2016). Enabling work migration in CoMD to study dynamic load imbalance solutions. Proceedings of the 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, pp. 98-107. DOI: 10.5555/3019057.3019067. Online publication date: 13-Nov-2016.
https://dl.acm.org/doi/10.5555/3019057.3019067
