DOI: 10.1145/2795122.2795125
research-article

Visualization of OpenCL application execution on CPU-GPU systems

Published: 13 June 2015

Abstract

Evaluating the performance of parallel and heterogeneous programs and architectures can be challenging. An emulator or simulator can be used to aid the programmer. To provide guidance and feedback to the programmer, the simulator needs to present traces, reports, and debugging information in a coherent and unambiguous format. Although these outputs contain detailed information about the logical and physical transactions that occur during execution, they are usually extremely large and hard to analyze. What is needed is an interface into the simulator that helps programmers and architects sift through this mass of data. In this contribution, we describe M2S-Visual, a trace-driven visualization tool that complements the Multi2sim (M2S) heterogeneous system simulator. M2S-Visual provides a graphical representation of parallel program execution on the simulator. M2S is an established simulator, designed with an emphasis on simulating the execution of parallel applications on graphics processing units, and it provides a number of instrumentation capabilities that enable research in architecture exploration and application characterization. The visualization framework added to Multi2sim aims to complement (and potentially replace) text-based statistical profiling, enabling the user to better learn and understand each software transaction executed on the simulated hardware. While M2S supports emulation of both OpenCL and CUDA programs, our visualization framework presently supports only OpenCL execution. M2S supports execution on both CPUs (X86, ARM, and MIPS) and GPUs (AMD Evergreen and Southern Islands, and NVIDIA Fermi and Kepler), but detailed visualization is presently supported only on a multicore X86 CPU and on AMD Evergreen and Southern Islands GPUs. Beyond supporting OpenCL programming and debugging, an additional goal of the tool is to deliver a reliable product for teaching the details of parallel program execution on heterogeneous systems. Given the industry's move to many-core architectures, this toolset is timely and addresses a growing gap in our educational infrastructure. The tool is also designed to support the research community by providing analysis of performance bottlenecks in OpenCL programs. We also incorporated the option to produce visualization graphs that provide deeper insight into application performance and hardware resource utilization.

Published In

WCAE '15: Proceedings of the Workshop on Computer Architecture Education
June 2015
64 pages
ISBN: 9781450337175
DOI: 10.1145/2795122
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cycle-based simulation
  2. heterogeneous systems
  3. trace-driven visualization

Qualifiers

  • Research-article

Conference

ISCA '15
Acceptance Rates

WCAE '15 Paper Acceptance Rate 9 of 10 submissions, 90%;
Overall Acceptance Rate 9 of 10 submissions, 90%
