DOI: 10.1145/2745754.2745776
research-article

Joins via Geometric Resolutions: Worst-case and Beyond

Published: 20 May 2015

Abstract

We present a simple geometric framework for the relational join. Using this framework, we design an algorithm that achieves the fractional hypertree-width bound, which generalizes classical and recent worst-case algorithmic results on computing joins. In addition, we use our framework and the same algorithm to show a series of what are colloquially known as beyond worst-case results. The framework allows us to prove results for data stored in B-trees, multidimensional data structures, and even multiple indices per table. A key idea in our framework is formalizing the inference one does with an index as a type of geometric resolution, which transforms the algorithmic problem of computing joins into a geometric problem. Our notion of geometric resolution can be viewed as a geometric analog of logical resolution. In addition to the geometry and logic connections, our algorithm can also be thought of as backtracking search with memoization.
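
The abstract does not define geometric resolution, so the following is only a minimal sketch of one way to picture the analogy with logical resolution. It assumes (this is not stated in the abstract) that each box is a tuple of bit-string prefixes, one per attribute, where a prefix denotes a dyadic interval and the empty string covers the whole domain. The function name `resolve` and the `Box` type are hypothetical. Just as the clauses (A or x) and (B or not x) yield (A or B), two boxes whose intervals are the two halves of the same parent interval on exactly one attribute yield a box that spans the parent on that attribute and is covered by their union.

```python
from typing import Optional, Tuple

# Hypothetical encoding: one bit-string prefix per attribute.
# "" covers the whole domain, "0" the lower half, "01" a quarter, and so on.
Box = Tuple[str, ...]


def resolve(a: Box, b: Box) -> Optional[Box]:
    """Return a box covered by the union of a and b, or None if they do not resolve."""
    if len(a) != len(b):
        return None
    out = []
    pivot_found = False
    for x, y in zip(a, b):
        if x.startswith(y):
            # x lies inside y (or equals it): keep the narrower interval.
            out.append(x)
        elif y.startswith(x):
            out.append(y)
        elif not pivot_found and len(x) == len(y) and x[:-1] == y[:-1]:
            # Sibling halves of one parent interval: merge into the parent.
            out.append(x[:-1])
            pivot_found = True
        else:
            # Disjoint, non-sibling intervals, or a second pivot: no resolvent.
            return None
    return tuple(out) if pivot_found else None


if __name__ == "__main__":
    print(resolve(("0", "10"), ("0", "11")))
    print(resolve(("0", ""), ("1", "01")))
```

Under this encoding, a resolvent exists only when exactly one attribute acts as the pivot; on every other attribute the result keeps the intersection of the two intervals, which is what keeps the inference sound.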

Published In

PODS '15: Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
May 2015
358 pages
ISBN:9781450327572
DOI:10.1145/2745754
General Chair: Tova Milo
Program Chair: Diego Calvanese

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. beyond worst-case analysis
  2. bounded-width join queries
  3. indices
  4. relational join
  5. resolution

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'15: International Conference on Management of Data
May 31 - June 4, 2015
Melbourne, Victoria, Australia

Acceptance Rates

PODS '15 Paper Acceptance Rate 25 of 80 submissions, 31%
Overall Acceptance Rate 642 of 2,707 submissions, 24%
