DOI: 10.1145/2464996.2465017 · Research article

Implementing OmpSs support for regions of data in architectures with multiple address spaces

Published: 10 June 2013

Abstract

The need for features to manage complex data accesses in modern programming models has grown with emerging hardware architectures: HPC hardware has moved towards clusters of accelerators and/or multicores, exposing a complex memory hierarchy to the programmer.
We present the implementation of data regions in the OmpSs programming model, a high-productivity, annotation-based programming model derived from OpenMP. Regions enable the programmer to specify strided and/or overlapping data used by the parallel tasks of an application. The data is managed automatically by the underlying run-time environment, which can transparently apply optimization techniques to improve performance.
This high-productivity approach contrasts with more explicit approaches such as MPI, where the programmer must handle data management directly. Such approaches are generally believed to achieve the best possible performance, so we also compare several OmpSs applications against well-known MPI counterparts, obtaining comparable or better results.




Published In

ICS '13: Proceedings of the 27th international ACM conference on International conference on supercomputing
June 2013
512 pages
ISBN:9781450321303
DOI:10.1145/2464996


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. cluster programming
  2. non-contiguous memory access
  3. openmp
  4. run-time environments


Conference

ICS '13: International Conference on Supercomputing
June 10-14, 2013
Eugene, Oregon, USA

Acceptance Rates

ICS '13 Paper Acceptance Rate: 43 of 202 submissions (21%)
Overall Acceptance Rate: 629 of 2,180 submissions (29%)
