DOI: 10.1145/2503210.2503233
research-article

Deterministic scale-free pipeline parallelism with hyperqueues

Published: 17 November 2013

Abstract

Ubiquitous parallel computing aims to make parallel programming accessible to a wide variety of programming areas using deterministic and scale-free programming models built on a task abstraction. However, it remains hard to reconcile these attributes with pipeline parallelism, where the number of pipeline stages is typically hard-coded in the program and defines the degree of parallelism.
This paper introduces hyperqueues, a programming abstraction that enables the construction of deterministic and scale-free pipeline parallel programs. Hyperqueues extend the concept of Cilk++ hyperobjects to provide thread-local views on a shared data structure. While hyperobjects are organized around private local views, hyperqueues require shared concurrent views on the underlying data structure. We define the semantics of hyperqueues and describe their implementation in a work-stealing scheduler. We demonstrate scalable performance on pipeline-parallel PARSEC benchmarks and find that hyperqueues provide comparable or up to 30% better performance than POSIX threads and Intel's Threading Building Blocks. The latter are highly tuned to the number of available processing cores, while programs using hyperqueues are scale-free.
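
The two properties named in the abstract can be illustrated with a small sketch. This is not the hyperqueue API or its work-stealing implementation, which the abstract does not detail; it is a plain thread-pool pipeline written under the assumption that "deterministic" means the consumer observes values in the producer's serial program order, and "scale-free" means the worker count is a runtime parameter rather than part of the program structure. The names `run_pipeline`, `square`, and `handoff` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def run_pipeline(items, transform, workers):
    """Two-stage pipeline: a parallel transform stage feeding an in-order consumer.

    Illustrative only: the queue carries futures in the order the producer
    submitted them, so the consumer sees results in serial program order
    regardless of how many workers execute the transform stage.
    """
    handoff = Queue()  # holds futures in the producer's program order
    output = []

    def consumer():
        while True:
            fut = handoff.get()
            if fut is _DONE:
                break
            # Blocks until this particular item is ready, preserving order.
            output.append(fut.result())

    consumer_thread = threading.Thread(target=consumer)
    consumer_thread.start()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for item in items:
            handoff.put(pool.submit(transform, item))
        handoff.put(_DONE)
    consumer_thread.join()
    return output


def square(x):
    return x * x


if __name__ == "__main__":
    # The worker count changes; the observed output sequence does not.
    for workers in (1, 2, 8):
        print(workers, run_pipeline(range(6), square, workers))
```

Running the sketch with 1, 2, or 8 workers produces the same list each time, which is the behaviour the abstract contrasts with pipelines whose stage count is hard-coded and tuned to the number of cores.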

Published In

SC '13: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
November 2013
1123 pages
ISBN:9781450323789
DOI:10.1145/2503210
General Chair: William Gropp
Program Chair: Satoshi Matsuoka
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SC13

Acceptance Rates

SC '13 Paper Acceptance Rate 91 of 449 submissions, 20%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
