DOI: 10.1145/2578948.2560687
tutorial

Palirria: Accurate On-line Parallelism Estimation for Adaptive Work-Stealing

Published: 07 February 2014

Abstract

We present Palirria, a self-adapting work-stealing scheduling method for nested fork/join parallelism that estimates the number of utilizable workers and adapts accordingly. The estimation mechanism is optimized for accuracy, minimizing the requested resources without degrading performance. We implemented Palirria for both the Linux and Barrelfish operating systems and evaluated it on two platforms: a 48-core NUMA multiprocessor and a simulated 32-core system. Compared to the state of the art, we observed higher accuracy in estimating resource requirements. This leads to improved resource utilization and performance on par with or better than execution with fixed resource allotments.

Cited By

  • (2021) A self-adjusting task granularity mechanism for the Java lifeline-based global load balancer library on many-core clusters. Concurrency and Computation: Practice and Experience, 34:2. DOI: 10.1002/cpe.6224. Online publication date: 18-Feb-2021
  • (2020) Self-adjusting task granularity for Global load balancer library on clusters of many-core processors. Proceedings of the Eleventh International Workshop on Programming Models and Applications for Multicores and Manycores, pages 1-10. DOI: 10.1145/3380536.3380539. Online publication date: 22-Feb-2020

    Published In

    PMAM'14: Proceedings of Programming Models and Applications on Multicores and Manycores
    February 2014
    156 pages
    ISBN:9781450326575
    DOI:10.1145/2578948

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. adaptive
    2. load balancing
    3. multicore
    4. parallel
    5. resource management
    6. runtime
    7. scheduler
    8. task
    9. work-stealing
    10. workload

    Qualifiers

    • Tutorial
    • Research
    • Refereed limited

    Conference

    PPoPP '14

    Acceptance Rates

    Overall Acceptance Rate 53 of 97 submissions, 55%
