DOI: 10.1145/3149869.3149870
research-article

Enabling Python to execute efficiently in heterogeneous distributed infrastructures with PyCOMPSs

Published: 12 November 2017

Abstract

Python has been adopted as a programming language by a large number of scientific communities. In addition to its easy programming interface, the many libraries and modules made available by a large number of contributors have taken the language to the top of the list of the most popular programming languages for scientific applications. However, one main drawback of Python is its lack of support for concurrency or parallelism. PyCOMPSs is a proven approach to supporting task-based parallelism in Python that enables applications to be executed in parallel on distributed computing platforms.
This paper presents PyCOMPSs and how it has been tailored to execute tasks in heterogeneous and multi-threaded environments. We present an approach that combines the task-level parallelism provided by PyCOMPSs with the thread-level parallelism provided by the Intel Math Kernel Library (MKL). Performance and behavioral results on heterogeneous distributed computing clusters show the benefits and capabilities of PyCOMPSs in both HPC and Big Data infrastructures.
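
As a rough illustration of the task-based model mentioned in the abstract, the sketch below splits a computation into independent blocks, submits one task per block, and synchronizes on the results. It uses Python's standard `concurrent.futures` module as a local stand-in and is not the PyCOMPSs API; the function names (`block_sum_of_squares`, `split_into_blocks`, `parallel_sum_of_squares`) are hypothetical and chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def block_sum_of_squares(block):
    """One 'task': a self-contained computation over one block of data."""
    return sum(x * x for x in block)


def split_into_blocks(data, block_size):
    """Cut the input into consecutive blocks of at most block_size items."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def parallel_sum_of_squares(data, block_size=4, workers=2):
    """Submit one task per block, then reduce the partial results."""
    blocks = split_into_blocks(data, block_size)
    # Assumption: a local thread pool stands in for a distributed task runtime.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each submit() returns a future, a placeholder for a task result.
        futures = [pool.submit(block_sum_of_squares, b) for b in blocks]
        # Calling result() waits for the corresponding task to finish.
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10))))
```

In a setting like the one the abstract describes, each task body could itself call a multi-threaded numerical library, so that thread-level parallelism runs inside each task while the runtime distributes the tasks themselves.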


Published In

PyHPC'17: Proceedings of the 7th Workshop on Python for High-Performance and Scientific Computing
November 2017
81 pages
ISBN:9781450351249
DOI:10.1145/3149869

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Big Data
  2. HPC
  3. Heterogeneous infrastructures
  4. Linear Algebra
  5. Python

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SC '17
Acceptance Rates

Overall Acceptance Rate 7 of 7 submissions, 100%
