DOI: 10.1145/2815400.2815413

Yesquel: Scalable SQL storage for Web applications

Published: 04 October 2015

Abstract

Web applications have been shifting their storage systems from SQL to NoSQL systems. NoSQL systems scale well but drop many convenient SQL features, such as joins, secondary indexes, and/or transactions. We design, develop, and evaluate Yesquel, a system that provides performance and scalability comparable to NoSQL with all the features of a SQL relational system. Yesquel has a new architecture and a new distributed data structure, called YDBT, which Yesquel uses for storage and which performs well under contention by many concurrent clients. We evaluate Yesquel and find that it performs almost as well as Redis, a popular NoSQL system, and much better than MySQL Cluster, while handling SQL queries at scale.

Published In

SOSP '15: Proceedings of the 25th Symposium on Operating Systems Principles
October 2015
499 pages
ISBN: 9781450338349
DOI: 10.1145/2815400
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

SOSP '15

Acceptance Rates

SOSP '15 Paper Acceptance Rate 30 of 181 submissions, 17%;
Overall Acceptance Rate 174 of 961 submissions, 18%
