DOI: 10.1145/1272366.1272383
Article

Data driven workflow planning in cluster management systems

Published: 25 June 2007

Abstract

Traditional scientific computing has been associated with harnessing computation cycles within and across clusters of machines. In recent years, scientific applications have become increasingly data-intensive. This is especially true in the fields of astronomy and high energy physics. Furthermore, the lowered cost of disks and commodity machines has led to a dramatic increase in the amount of free disk space spread across machines in a cluster. This space is not being exploited by traditional distributed computing tools. In this paper we have evaluated ways to improve the data management capabilities of Condor, a popular distributed computing system. We have augmented the Condor system by providing the capability to store data used and produced by workflows on the disks of machines in the cluster. We have also replaced the Condor matchmaker with a new workflow planning framework that is cognizant of dependencies between jobs in a workflow and exploits these new data storage capabilities to produce workflow schedules. We show that our data caching and workflow planning framework can significantly reduce response times for data-intensive workflows by reducing data transfer over the network in a cluster. We also consider ways in which this planning framework can be made adaptive in a dynamic, multi-user, failure-prone environment.
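As an illustration of the approach summarized above (placing jobs on machines that already hold their input data and caching workflow outputs on local disks), the following Python sketch shows a greedy, data-locality-aware planner. It is not the system described in the paper: the Job and Machine structures, the file sizes, and the simple "most cached input bytes" heuristic are assumptions made only for this example.

    # Illustrative sketch (not the authors' implementation): a greedy,
    # data-locality-aware planner for a workflow of jobs with file dependencies.
    # Job/Machine fields, file sizes, and the cost model are assumed for the example.
    from dataclasses import dataclass, field

    @dataclass
    class Job:
        name: str
        inputs: set                               # files this job reads
        outputs: set                              # files this job writes
        deps: set = field(default_factory=set)    # jobs that must finish first

    @dataclass
    class Machine:
        name: str
        cached: set                               # files already on this machine's local disk

    def plan(jobs, machines, file_sizes):
        """Assign each ready job to the machine caching the most input bytes,
        so less data moves over the network; outputs are cached where produced."""
        done, schedule = set(), []
        remaining = {j.name: j for j in jobs}
        while remaining:
            ready = [j for j in remaining.values() if j.deps <= done]
            for job in ready:
                # Prefer the machine that already holds the largest share of the inputs.
                best = max(machines,
                           key=lambda m: sum(file_sizes[f] for f in job.inputs & m.cached))
                best.cached |= job.inputs | job.outputs   # cache inputs and new outputs locally
                schedule.append((job.name, best.name))
                done.add(job.name)
                del remaining[job.name]
        return schedule

    # Example: a two-stage pipeline where stage2 reuses stage1's output.
    sizes = {"raw.dat": 500, "stage1.out": 200}
    jobs = [Job("stage1", {"raw.dat"}, {"stage1.out"}),
            Job("stage2", {"stage1.out"}, {"final.out"}, deps={"stage1"})]
    machines = [Machine("node-a", {"raw.dat"}), Machine("node-b", set())]
    print(plan(jobs, machines, sizes))   # stage2 follows its input to node-a

Weighing machine load or whole dependency chains in the placement choice, as the paper's adaptive planning discussion suggests, would move this sketch closer to a full workflow planner.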




Published In

HPDC '07: Proceedings of the 16th International Symposium on High Performance Distributed Computing
June 2007
256 pages
ISBN:9781595936738
DOI:10.1145/1272366
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 June 2007


Author Tags

  1. cluster management
  2. condor
  3. data management
  4. planning
  5. scheduling
  6. scientific computing
  7. workflow management

Qualifiers

  • Article

Conference

HPDC07

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%
