DOI: 10.1145/2588555.2610502

H2O: a hands-free adaptive store

Published: 18 June 2014

Abstract

Modern state-of-the-art database systems are designed around a single data storage layout. This fixed decision drives the whole architectural design of a database system, e.g., as a row-store or a column-store. However, none of these choices is a universally good solution; different workloads require different storage layouts and data access methods to achieve good performance.
In this paper, we present the H2O system, which introduces two novel concepts. First, it is flexible enough to support multiple storage layouts and data access patterns in a single engine. Second, and most importantly, it decides on the fly, i.e., during query processing, which design is best for classes of queries and the respective data parts. At any given point in time, parts of the data might be materialized in various patterns depending purely on the query workload; as the workload changes, and with every single query, the storage and access patterns continuously adapt. In this way, H2O makes no a priori, fixed decisions on how data should be stored, allowing each query to enjoy a storage and access pattern tailored to its specific properties.
We present a detailed analysis of H2O using both synthetic benchmarks and realistic scientific workloads. We demonstrate that while existing systems cannot achieve maximum performance across all workloads, H2O can always match the best-case performance without requiring any tuning or workload knowledge.
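
The idea of adapting the layout to the workload can be illustrated with a toy sketch. The code below is not the H2O implementation; the class `ToyAdaptiveStore`, its `threshold` parameter, and the "materialize after repeated access" rule are all assumptions made for illustration. It keeps data in a column layout and, once the same set of columns has been queried together often enough, materializes a row-major copy of that column group and uses it for later queries.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAdaptiveStore:
    """Toy illustration only; names and policy are hypothetical, not H2O's."""
    columns: dict                                  # base layout: name -> list of ints
    groups: dict = field(default_factory=dict)     # materialized row-major column groups
    seen: dict = field(default_factory=dict)       # how often each column set was queried
    threshold: int = 2                             # assumed: materialize on the 2nd access

    def sum_columns(self, names):
        """Sum all values of the requested columns; report which layout served the query."""
        key = frozenset(names)
        self.seen[key] = self.seen.get(key, 0) + 1

        # Adapt as a side effect of query processing: build a row-major
        # copy of this column group once the access pattern repeats.
        if key not in self.groups and self.seen[key] >= self.threshold:
            ordered = sorted(key)
            self.groups[key] = list(zip(*(self.columns[n] for n in ordered)))

        if key in self.groups:
            # Row-major group: one pass touches all requested attributes together.
            return sum(sum(row) for row in self.groups[key]), "group"
        # Column layout: scan each requested column separately.
        return sum(sum(self.columns[n]) for n in key), "column"


store = ToyAdaptiveStore({"a": [1, 2, 3], "b": [10, 20, 30], "c": [100, 200, 300]})
first = store.sum_columns(["a", "b"])    # served from the column layout
second = store.sum_columns(["a", "b"])   # same column set again: group is materialized
other = store.sum_columns(["c"])         # new column set: still the column layout
```

In this sketch the answer is the same under either layout; only the physical access path changes, which is the property the abstract describes. A real system would choose layouts with a cost model rather than a fixed counter.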

Published In

SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
June 2014
1645 pages
ISBN: 9781450323765
DOI: 10.1145/2588555
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. adaptive hybrids
  2. adaptive storage
  3. dynamic operators

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'14
Acceptance Rates

SIGMOD '14 Paper Acceptance Rate 107 of 421 submissions, 25%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%
