PLDS: Partitioning linked data structures for parallelism

Published: 26 January 2012

Abstract

Recently, parallelization of computations in the presence of dynamic data structures has shown promising potential. In this paper, we present PLDS, a system for easily expressing and efficiently exploiting parallelism in computations that are based on dynamic linked data structures. PLDS improves execution efficiency by providing support for data partitioning and then distributing computation across threads based on the partitioning. Such computations often require the use of speculation to exploit dynamic parallelism. PLDS supports a conditional speculation mechanism that reduces the cost of speculation. PLDS can be employed in the context of different forms of parallelism to cover a wide range of parallel applications. PLDS provides easy-to-use compiler directives that enable programmers to choose from among a variety of data partitionings, to distribute computation across threads in a partitioning-sensitive fashion, and to use conditional speculation when required. We evaluate our implementation of PLDS using ten benchmarks, of which six are parallelized using speculation. PLDS achieves speedups of 1.3x to 6.9x on an 8-core machine.
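
The idea of partitioning a linked data structure and then distributing computation according to that partitioning can be illustrated with a minimal sketch. It is not PLDS itself: the abstract does not show the compiler directives, so the names below (`Node`, `block_partition`, `process_partition`, `parallel_square_sum`) and the choice of a contiguous block partitioning are assumptions made purely for illustration. Conditional speculation is not modeled, and Python threads here only show how work is divided, not the speedups reported in the paper.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None
    partition: int = 0  # hypothetical field: id of the partition that owns this node


def build_list(values) -> Optional[Node]:
    """Build a singly linked list holding `values` in order."""
    head = None
    for v in reversed(list(values)):
        head = Node(v, head)
    return head


def block_partition(head: Optional[Node], num_parts: int) -> List[List[Node]]:
    """Assign contiguous runs of nodes to partitions (one assumed scheme among many)."""
    nodes = []
    node = head
    while node is not None:
        nodes.append(node)
        node = node.next
    parts: List[List[Node]] = [[] for _ in range(num_parts)]
    chunk = max(1, -(-len(nodes) // num_parts))  # ceiling division, at least 1
    for i, n in enumerate(nodes):
        n.partition = min(i // chunk, num_parts - 1)
        parts[n.partition].append(n)
    return parts


def process_partition(nodes: List[Node]) -> int:
    """Square every node in one partition and return the partition's sum."""
    # Each worker touches only the nodes of its own partition, so no locking is needed.
    for n in nodes:
        n.value = n.value * n.value
    return sum(n.value for n in nodes)


def parallel_square_sum(head: Optional[Node], num_threads: int = 4) -> int:
    """Run one task per partition and combine the per-partition results."""
    parts = block_partition(head, num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return sum(pool.map(process_partition, parts))
```

Because every thread works only on the nodes of its own partition, this particular computation needs no synchronization. Computations whose accesses can cross partition boundaries are the cases where, according to the abstract, speculation becomes necessary.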

Published In

ACM Transactions on Architecture and Code Optimization  Volume 8, Issue 4
Special Issue on High-Performance Embedded Architectures and Compilers
January 2012
765 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2086696

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 January 2012
Accepted: 01 November 2011
Revised: 01 October 2011
Received: 01 July 2011
Published in TACO Volume 8, Issue 4

Author Tags

  1. Linked data structures
  2. parallel programming
  3. parallelization
  4. partitioning
  5. speculation

Qualifiers

  • Research-article
  • Research
  • Refereed
