DOI: 10.1145/3337821.3337916

A Plugin Architecture for the TAU Performance System

Published: 05 August 2019

Abstract

Several robust performance systems have been created for parallel machines, with the ability to observe diverse aspects of application execution on different hardware platforms. All are designed with the objective of supporting measurement methods that are efficient, portable, and scalable, and for these reasons the performance measurement infrastructure is tightly embedded with the application code and runtime execution environment. As parallel software and systems evolve, especially toward more heterogeneous, asynchronous, and dynamic operation, the requirements for performance observation and awareness are expected to change. For instance, heterogeneous machines introduce new types of performance data to capture and new performance behaviors to characterize. Furthermore, there is growing interest in interacting with the performance infrastructure for in situ analytics and policy-based control. The problem is that an existing performance system architecture can be constrained in its ability to evolve to meet these new requirements. This paper reports our research efforts to address this concern in the context of the TAU Performance System. In particular, we consider the use of a powerful plugin model both to capture existing capabilities in TAU and to extend its functionality in ways not originally conceived. The TAU plugin architecture supports three plugin paradigms: EVENT, TRIGGER, and AGENT. We demonstrate how each operates under several different scenarios. Results from larger-scale experiments highlight that efficiency and robustness can be maintained while new flexibility and programmability are offered, leveraging the power of the core TAU system while enabling significant and compelling extensions.

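To make the EVENT paradigm concrete, the sketch below shows the general shape of a TAU plugin: a shared library that registers callbacks for measurement events and is loaded by TAU at startup. This is a minimal sketch based on the plugin model the paper describes; the exact header, type, macro, and callback names used here (Tau_plugin_types.h, Tau_plugin_callbacks_t, TAU_UTIL_INIT_TAU_PLUGIN_CALLBACKS, and so on) are illustrative assumptions and should be checked against the headers shipped with a given TAU release.

```c
/* Minimal sketch of an EVENT-style TAU plugin. All TAU-specific names
 * here are assumptions for illustration; consult the plugin headers in
 * your TAU installation for the actual interface. */
#include <stdio.h>
#include <Tau_plugin_types.h>  /* assumed: callback and event data structs */
#include <Tau_plugin_api.h>    /* assumed: registration macros */

/* Callback invoked when TAU finalizes measurement for a thread/process. */
static int my_end_of_execution(Tau_plugin_event_end_of_execution_data_t *data) {
    fprintf(stderr, "plugin: end of execution on tid %d\n", data->tid);
    return 0;
}

/* Entry point TAU looks for in each plugin shared library it loads. */
int Tau_plugin_init_func(int argc, char **argv, int id) {
    Tau_plugin_callbacks_t cb;
    TAU_UTIL_INIT_TAU_PLUGIN_CALLBACKS(&cb);     /* zero all callback slots */
    cb.EndOfExecution = my_end_of_execution;     /* subscribe to one event  */
    TAU_UTIL_PLUGIN_REGISTER_CALLBACKS(&cb, id); /* hand callbacks to TAU   */
    return 0;
}
```

A plugin of this form would be compiled into a shared object and named to TAU at startup, typically via environment variables such as TAU_PLUGINS and TAU_PLUGINS_PATH. TRIGGER and AGENT plugins differ mainly in what drives plugin activity: application-initiated trigger points in the first case, and an autonomous helper thread in the second, rather than measurement events.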

Published In

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing
August 2019
1107 pages
ISBN: 9781450362955
DOI: 10.1145/3337821

In-Cooperation

  • University of Tsukuba

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. measurement
  2. parallel performance
  3. plugin
  4. runtime analytics

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2019

Acceptance Rates

Overall Acceptance Rate: 91 of 313 submissions, 29%
