
Modeling and synthesizing task placement constraints in Google compute clusters

Published: 26 October 2011
DOI: 10.1145/2038916.2038919

Abstract

Evaluating the performance of large compute clusters requires benchmarks with representative workloads. At Google, performance benchmarks are used to obtain performance metrics such as task scheduling delays and machine resource utilizations to assess changes in application code, machine configurations, and scheduling algorithms. Existing approaches to workload characterization for high performance computing and grids focus on task resource requirements for CPU, memory, disk, I/O, network, etc. Such resource requirements specify how much of each resource a task consumes. However, in addition to resource requirements, Google workloads commonly include task placement constraints that determine which machine resources are consumed by tasks. Task placement constraints arise from task dependencies on machine properties such as hardware architecture and kernel version.
This paper develops methodologies for incorporating task placement constraints and machine properties into performance benchmarks of large compute clusters. Our studies of Google compute clusters show that constraints increase average task scheduling delays by a factor of 2 to 6, which often results in tens of minutes of additional task wait time. To understand why, we extend the concept of resource utilization to include constraints by introducing a new metric, the Utilization Multiplier (UM). UM is the ratio of the resource utilization seen by tasks with a constraint to the average utilization of the resource. UM provides a simple model of the performance impact of constraints: task scheduling delays increase with UM. Finally, we describe how to synthesize representative task constraints and machine properties, and how to incorporate this synthesis into existing performance benchmarks. Using synthetic task constraints and machine properties generated by our methodology, we accurately reproduce performance metrics for benchmarks of Google compute clusters, with a discrepancy of only 13% in task scheduling delay and 5% in resource utilization.
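To make the UM definition concrete, the following is a minimal sketch of how the metric could be computed from per-machine usage data, based only on the ratio described above. The Machine record and the satisfies predicate are hypothetical illustrations, not the paper's actual data model.

```python
# Minimal sketch of the Utilization Multiplier (UM) as defined in the
# abstract: the utilization of a resource on the machines that satisfy a
# constraint, divided by the average utilization of that resource across
# all machines. Machine and `satisfies` are hypothetical illustrations;
# assumes a non-empty set of eligible machines with non-zero capacity.
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Machine:
    used: float      # amount of the resource currently consumed
    capacity: float  # total amount of the resource on the machine

def utilization(machines: Iterable[Machine]) -> float:
    """Aggregate utilization of one resource: total usage / total capacity."""
    ms = list(machines)
    return sum(m.used for m in ms) / sum(m.capacity for m in ms)

def utilization_multiplier(machines: list[Machine],
                           satisfies: Callable[[Machine], bool]) -> float:
    """UM for one constraint: utilization on the machines that satisfy it,
    relative to the average utilization across all machines."""
    eligible = [m for m in machines if satisfies(m)]
    return utilization(eligible) / utilization(machines)
```

Under this reading, a constraint with UM near 1 is roughly performance-neutral, while a UM well above 1 means constrained tasks compete for scarcer, more heavily used eligible capacity, consistent with the increased scheduling delays reported above.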



Published In

SOCC '11: Proceedings of the 2nd ACM Symposium on Cloud Computing
October 2011
377 pages
ISBN: 978-1-4503-0976-9
DOI: 10.1145/2038916
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. benchmarking
  2. benchmarks
  3. metrics
  4. modeling
  5. performance evaluation
  6. workload characterization

Qualifiers

  • Research-article


Conference

SOCC '11

Acceptance Rates

Overall Acceptance Rate: 169 of 722 submissions, 23%
