DOI: 10.5555/1789826.1789845

Scheduling dynamic OpenMP applications over multicore architectures

Published: 12 May 2008

Abstract

Approaching the theoretical performance of hierarchical multicore machines requires a very careful distribution of threads and data among the underlying non-uniform architecture in order to minimize cache misses and NUMA penalties. While it is acknowledged that OpenMP can enhance the quality of thread scheduling on such architectures in a portable way, by transmitting precious information about the affinities between threads and data to the underlying runtime system, most OpenMP runtime systems are actually unable to efficiently support highly irregular, massively parallel applications on NUMA machines.
In this paper, we present a thread scheduling policy suited to the execution of OpenMP programs featuring irregular and massive nested parallelism over hierarchical architectures. Our policy enforces a distribution of threads that maximizes the proximity of threads belonging to the same parallel region, and uses a NUMA-aware work stealing strategy when load balancing is needed. It has been developed as a plug-in to the forestGOMP OpenMP platform [TBG+07]. We demonstrate the efficiency of our approach with a highly irregular recursive OpenMP program resulting from the generic parallelization of a surface reconstruction application. We achieve a speedup of 14 on a 16-core machine with no application-level optimization.
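
To make the idea of NUMA-aware work stealing concrete, the following is a minimal sketch and not the policy implemented in forestGOMP, whose details the abstract does not give. It assumes a hypothetical two-level machine (NUMA nodes containing cores), one task queue per core, and a simple rule: an idle core steals from the nearest core that has pending work, preferring cores on its own NUMA node. The names `build_topology`, `distance`, `pick_victim`, and `steal` are illustrative only.

```python
from collections import deque


def build_topology(numa_nodes=4, cores_per_node=4):
    """Map each core id to (NUMA node, index within node).

    Hypothetical two-level hierarchy; real machines may have more levels
    (shared caches, dies, sockets).
    """
    return {
        node * cores_per_node + idx: (node, idx)
        for node in range(numa_nodes)
        for idx in range(cores_per_node)
    }


def distance(topology, a, b):
    """Hierarchical distance: 0 same core, 1 same NUMA node, 2 otherwise."""
    if a == b:
        return 0
    return 1 if topology[a][0] == topology[b][0] else 2


def pick_victim(topology, queues, thief):
    """Return the closest core with pending tasks, or None if there is none."""
    candidates = [c for c in topology if c != thief and queues[c]]
    if not candidates:
        return None
    # Assumption: prefer nearer cores, then the most loaded one among equals.
    return min(
        candidates,
        key=lambda c: (distance(topology, thief, c), -len(queues[c])),
    )


def steal(topology, queues, thief):
    """Move one task from the chosen victim to the thief's queue.

    Returns (victim, task), or None when no work is available anywhere.
    """
    victim = pick_victim(topology, queues, thief)
    if victim is None:
        return None
    task = queues[victim].popleft()  # take the oldest task of the victim
    queues[thief].append(task)
    return victim, task
```

With two nodes of two cores each, an idle core 0 would take work from core 1 (same node) before looking at cores 2 or 3, even if a remote queue is longer. Stealing locally first keeps the stolen task close to data that its sibling threads have probably already touched, which is the kind of proximity the abstract's policy aims to preserve.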

References

[1]
Ayguade, E., Copty, N., Duran, A., Hoeflinger, J., Lin, Y., Massaioli, F., Su, E., Unnikrishnan, P., Zhang, G.: A proposal for task parallelism in OpenMP. In: Third International Workshop on OpenMP (IWOMP 2007), Beijing, China (2007).
[2]
Ayguade, E., Gonzalez, M., Martorell, X., Jost, G.: Employing Nested OpenMP for the Parallelization of Multi-Zone Computational Fluid Dynamics Applications. In: 18th International Parallel and Distributed Processing Symposium (IPDPS) (2004).
[3]
an Mey, D., Sarholz, S., Terboven, C.: Nested Parallelization with OpenMP. Parallel Computing 35(5), 459-476 (2007).
[4]
Balart, J., Duran, A., Gonzàlez, M., Martorell, X., Ayguadé, E., Labarta, J.: Nanos Mercurium: A research compiler for OpenMP. In: European Workshop on OpenMP (EWOMP) (October 2004).
[5]
Blikberg, R., Sørevik, T.: Load balancing and OpenMP implementation of nested parallelism. Parallel Computing 31(10-12), 984-998 (October 2005).
[6]
Chapman, B.M., Huang, L., Jin, H., Jost, G., de Supinski, B.R.: Extending OpenMP worksharing directives for multithreading. In: EuroPar 2006 Parallel Processing (2006).
[7]
Duran, A., Gonzàlez, M., Corbalán, J.: Automatic Thread Distribution for Nested Parallelism in OpenMP. In: 19th ACM International Conference on Supercomputing, Cambridge, MA, USA, June 2005, pp. 121-130 (2005).
[8]
Duran, A., Silvera, R., Corbalán, J., Labarta, J.: Runtime adjustment of parallel nested loops. In: Chapman, B.M. (ed.) WOMPAT 2004. LNCS, vol. 3349, Springer, Heidelberg (2005).
[9]
Frigo, M., Leiserson, C.E., Randall, K.H.: The Implementation of the Cilk-5 Multithreaded Language. In: ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Montreal, Canada (June 1998).
[10]
GOMP - An OpenMP implementation for GCC, http://gcc.gnu.org/projects/gomp/
[11]
Gonzalez, M., Oliver, J., Martorell, X., Ayguade, E., Labarta, J., Navarro, N.: OpenMP Extensions for Thread Groups and Their Run-Time Support. In: Languages and Compilers for Parallel Computing, Springer, Heidelberg (2001).
[12]
Gao, G.R., Sterling, T., Stevens, R., Hereld, M., Zhu, W.: Hierarchical multithreading: programming model and system software. In: 20th International Parallel and Distributed Processing Symposium (IPDPS) (April 2006).
[13]
Gerndt, A., Sarholz, S., Wolter, M., an Mey, D., Bischof, C., Kuhlen, T.: Nested OpenMP for Efficient Computation of 3D Critical Points in Multi-Block CFD Datasets. In: Super Computing (November 2006).
[14]
Hadjidoukas, P.E., Dimakopoulos, V.V.: Nested Parallelism in the OMPi OpenMP/C compiler. In: EuroPar, Rennes, France, July 2007, ACM, New York (2007).
[15]
Karlsson, S.: An Introduction to Balder - An OpenMP Run-time Library for Clusters of SMPs. In: International Workshop on OpenMP (IWOMP) (June 2005).
[16]
Martorell, X., Ayguadé, E., Navarro, N., Corbalán, J., González, M., Labarta, J.: Thread Fork/Join Techniques for Multi-Level Parallelism Exploitation in NUMA Multiprocessors. In: International Conference on SuperComputing, pp. 294-301. ACM Press, New York (1999).
[17]
Ohtake, Y., Belyaev, A., Alexa, M., Turk, G., Seidel, H.-P.: Multi-level partition of unity implicits. ACM Trans. Graph. 22(3), 463-470 (2003).
[18]
Su, E., Tian, X., Haab, M.G.G., Shah, S., Petersen, P.: Compiler Support of the Workqueuing Execution Model for Intel SMP Architectures. In: European Workshop on OpenMP (EWOMP) (October 2004).
[19]
Thibault, S., Broquedis, F., Goglin, B., Namyst, R., Wacrenier, P.-A.: An Efficient OpenMP Runtime System for Hierarchical Architectures. In: International Workshop on OpenMP (IWOMP), Beijing, China, June 2007, pp. 148-159 (2007).
[20]
Tian, X., Girkar, M., Bik, A., Saito, H.: Practical Compiler Techniques on Efficient Multithreaded Code Generation for OpenMP Programs. Comput. J. 48(5), 588-601 (2005).
[21]
Tian, X., Girkar, M., Shah, S., Armstrong, D., Su, E., Petersen, P.: Compiler and Runtime Support for Running OpenMP Programs on Pentium- and Itanium-Architectures. In: Eighth International Workshop on High-Level Parallel Programming Models and Supportive Environments, April 2003, pp. 47-55 (2003).
[22]
Tian, X., Hoeflinger, J.P., Haab, G., Chen, Y.-K., Girkar, M., Shah, S.: A compiler for exploiting nested parallelism in OpenMP programs. Parallel Comput. 31(10-12), 960-983 (2005).
[23]
Tanaka, Y., Taura, K., Sato, M., Yonezawa, A.: Performance evaluation of OpenMP applications with nested parallelism. In: Languages, Compilers, and Run-Time Systems for Scalable Computers, pp. 100-112 (2000).

Published In

IWOMP'08: Proceedings of the 4th international conference on OpenMP in a new era of parallelism
May 2008
191 pages
ISBN: 354079560X
Editors: Rudolf Eigenmann, Bronis R. De Supinski

Publisher

Springer-Verlag, Berlin, Heidelberg

Author Tags

  1. NUMA
  2. SMP
  3. bubbles
  4. hierarchical thread scheduling
  5. multi-core
  6. nested parallelism
  7. OpenMP

Qualifiers

  • Article
