skip to main content
10.1145/2390021.2390028acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

The Yahoo!: cloud datastore load balancer

Published: 29 October 2012 Publication History

Abstract

Sherpa is a large-scale distributed and globally replicated multi-tenant cloud data storage system. Sherpa scales by horizontally partitioning data into tablets and distributing these tablets across multiple servers. While Sherpa scales for increasing workload sizes by adding servers, it is vulnerable to load imbalance among tablets that cause hotspots to develop on just a few servers.
In this paper we describe Yak, the Sherpa load balancer. Yak detects hotspots and then automatically balances load by migrating tablets from the overloaded servers, and also by splitting data into new tablets. We describe Yak's design principles, algorithms and architecture. We then evaluate Yak on workloads based on Sherpa production scenarios.

References

[1]
E. Anderson, J. Hall, J. D. Hartline, M. Hobbs, A. R. Karlin, J. Saia, R. Swaminathan, and J. Wilkes. An experimental study of data migration algorithms. In Proceedings of the 5th International Workshop on Algorithm Engineering, WAE '01, pages 145--158, London, UK, 2001. Springer-Verlag.
[2]
E. Anderson, M. Hobbs, K. Keeton, S. Spence, M. Uysal, and A. Veitch. Hippodrome: Running circles around storage administration. In Proceedings of the 1st USENIX Conference on File and Storage Technologies, FAST '02, Berkeley, CA, USA, 2002. USENIX Association.
[3]
P. Bodík, A. Fox, M. J. Franklin, M. I. Jordan, and D. A. Patterson. Characterizing, modeling, and generating workload spikes for stateful services. In SoCC, pages 241--252, 2010.
[4]
F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: a distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7, pages 15--15, Berkeley, CA, USA, 2006. USENIX Association.
[5]
B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. Pnuts: Yahoo!'s hosted data serving platform. PVLDB, 1(2):1277--1288, 2008.
[6]
B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears. Benchmarking Cloud Serving Systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing, SoCC '10, pages 143--154, New York, NY, USA, 2010. ACM.
[7]
G. P. Copeland, W. Alexander, E. E. Boughter, and T. W. Keller. Data placement in bubba. In SIGMOD, pages 99--108, 1988.
[8]
G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's highly available key-value store. In Proceedings of Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, SOSP'07, pages 205--220, New York, NY, USA, 2007. ACM.
[9]
P. Ganesan, M. Bawa, and H. Garcia-Molina. Online balancing of range-partitioned data with applications to peer-to-peer systems. In Proceedings of the Thirtieth international conference on Very Large Data Bases - Volume 30, VLDB '04, pages 444--455. VLDB Endowment, 2004.
[10]
J. Hall, J. Hartline, A. R. Karlin, J. Saia, and J. Wilkes. On algorithms for efficient data migration. In Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, SODA '01, pages 620--629, Philadelphia, PA, USA, 2001. Society for Industrial and Applied Mathematics.
[11]
D. R. Karger and M. Ruhl. Simple efficient load balancing algorithms for peer-to-peer systems. In Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures, SPAA '04, pages 36--43, New York, NY, USA, 2004. ACM.
[12]
D. Kunkle and J. Schindler. A load balancing framework for clustered storage systems. In Proceedings of the 15th international conference on High Performance Computing, HiPC'08, pages 57--72, Berlin, Heidelberg, 2008. Springer-Verlag.
[13]
A. Lakshman and P. Malik. Cassandra: structured storage system on a p2p network. In Proceedings of the 28th ACM symposium on Principles of distributed computing, PODC '09, pages 5--5, New York, NY, USA, 2009. ACM.
[14]
C. Lu, G. A. Alvarez, and J. Wilkes. Aqueduct: Online data migration with performance guarantees. In Proceedings of the 1st USENIX Conference on File and Storage Technologies, FAST '02, Berkeley, CA, USA, 2002. USENIX Association.
[15]
B. Trushkowsky, P. Bodík, A. Fox, M. J. Franklin, M. I. Jordan, and D. A. Patterson. The scads director: scaling a distributed storage system under stringent performance requirements. In Proceedings of the 9th USENIX conference on File and stroage technologies, FAST'11, pages 12--12, Berkeley, CA, USA, 2011. USENIX Association.
[16]
B. Urgaonkar, P. J. Shenoy, A. Chandra, and P. Goyal. Dynamic provisioning of multi-tier internet applications. In International Conference on Autonomic Computing (ICAC), pages 217--228, 2005.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
CloudDB '12: Proceedings of the fourth international workshop on Cloud data management
October 2012
74 pages
ISBN:9781450317085
DOI:10.1145/2390021
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 October 2012

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. dynamic control
  2. dynamic load balancing
  3. elasticity
  4. scalability

Qualifiers

  • Research-article

Conference

CIKM'12
Sponsor:

Acceptance Rates

Overall Acceptance Rate 12 of 17 submissions, 71%

Upcoming Conference

CIKM '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)3
Reflects downloads up to 06 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media