AutoPlacer: Scalable Self-Tuning Data Placement in Distributed Key-Value Stores

Authors:

Luís Rodrigues

ACM Transactions on Autonomous and Adaptive Systems (TAAS), Volume 9, Issue 4

Article No.: 19, Pages 1 - 30

https://doi.org/10.1145/2641573

Published: 08 December 2014

Abstract

This article addresses the problem of self-tuning data placement in replicated key-value stores. The goal is to automatically optimize replica placement so that it exploits locality patterns in data accesses and thereby minimizes internode communication. Doing this efficiently is extremely challenging: one must not only find lightweight and scalable ways to identify the right assignment of data replicas to nodes but also preserve fast data lookup. The article introduces new techniques that address both challenges. The first challenge is addressed by optimizing, in a decentralized way, the placement of the objects that generate the largest number of remote operations for each node. The second challenge is addressed by combining consistent hashing with a novel data structure that provides efficient probabilistic data placement. These techniques have been integrated into a popular open-source key-value store. The performance results show that the throughput of the optimized system can be six times higher than that of a baseline system employing the widely used static placement based on consistent hashing.
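The two ideas in the abstract can be illustrated with a minimal sketch. It is not the article's implementation: the abstract does not describe the novel data structure, so an exact Python dictionary (`overrides`) stands in for it here, and every name below (`Ring`, `Placement`, `hottest_remote_keys`, the `vnodes` parameter) is hypothetical. The sketch only shows the general shape: consistent hashing gives a default owner for every key, each node counts which keys cost it the most remote operations, and a small override table relocates just those keys while lookup stays local and fast.

```python
import bisect
import hashlib
from collections import Counter


def _hash(value: str) -> int:
    """Map a string to a point on the ring (stable across runs)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Plain consistent hashing with virtual nodes (static placement)."""

    def __init__(self, nodes, vnodes=64):
        # Each node contributes `vnodes` points to smooth the key distribution.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def owner(self, key: str) -> str:
        # First ring point at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_left(self._hashes, _hash(key)) % len(self._points)
        return self._points[idx][1]


class Placement:
    """Consistent hashing plus a table of relocated keys.

    Assumption: `overrides` is an exact dict used only for illustration;
    the article uses a compact probabilistic structure instead.
    """

    def __init__(self, nodes):
        self.ring = Ring(nodes)
        self.overrides = {}

    def owner(self, key: str) -> str:
        # Relocated keys win; every other key falls back to the ring.
        return self.overrides.get(key, self.ring.owner(key))


def hottest_remote_keys(local_node, accesses, placement, k):
    """Return the k keys that caused the most remote operations for local_node."""
    remote = Counter(key for key in accesses if placement.owner(key) != local_node)
    return [key for key, _ in remote.most_common(k)]
```

A node would call `hottest_remote_keys` over its recent access log and then record a new owner for each returned key, for example `placement.overrides[key] = local_node`. Keys that are never relocated cost no extra state, which is the property that makes falling back to consistent hashing attractive.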

Index Terms

AutoPlacer: Scalable Self-Tuning Data Placement in Distributed Key-Value Stores
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Software and its engineering
  1. Software organization and properties
    1. Software system structures
      1. Distributed systems organizing principles

Information & Contributors

Information

Published In

ACM Transactions on Autonomous and Adaptive Systems Volume 9, Issue 4

January 2015

137 pages

ISSN:1556-4665

EISSN:1556-4703

DOI:10.1145/2695594

Editors:
Manish Parashar, Rutgers University, USA
Franco Zambonelli, University of Modena e Reggio Emilia, Italy

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 December 2014

Accepted: 01 June 2014

Revised: 01 June 2014

Received: 01 January 2014

Published in TAAS Volume 9, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

Cloud-TM project (cofinanced by the European Commission through contract no. 257784)
Fundação para a Ciência e a Tecnologia
