DOI: 10.1145/3375395.3387662
research-article

Answering (Unions of) Conjunctive Queries using Random Access and Random-Order Enumeration

Published: 14 June 2020

Abstract

As data analytics becomes more crucial to digital systems, so grows the importance of characterizing the database queries that admit a more efficient evaluation. We consider the tractability yardstick of answer enumeration with a polylogarithmic delay after a linear-time preprocessing phase. Such an evaluation is obtained by constructing, in the preprocessing phase, a data structure that supports polylogarithmic-delay enumeration. In this paper, we seek a structure that supports the more demanding task of a "random permutation": polylogarithmic-delay enumeration in truly random order. Enumeration of this kind is required if downstream applications assume that the intermediate results are representative of the whole result set in a statistically valuable manner. An even more demanding task is that of a "random access": polylogarithmic-time retrieval of an answer whose position is given. We establish that the free-connex acyclic CQs are tractable in all three senses: enumeration, random-order enumeration, and random access; and in the absence of self-joins, it follows from past results that every other CQ is intractable by each of the three (under some fine-grained complexity assumptions). However, the three yardsticks are separated in the case of a union of CQs (UCQ): while a union of free-connex acyclic CQs has a tractable enumeration, it may (provably) admit no random access. For such UCQs we devise a random-order enumeration whose delay is logarithmic in expectation. We also identify a subclass of UCQs for which we can provide random access with polylogarithmic access time. Finally, we present an implementation and an empirical study that show a considerable practical superiority of our random-order enumeration approach over state-of-the-art alternatives.
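To make the relationship between random access and random-order enumeration concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the class `JoinIndex`, the function `random_order`, and the example query Q(x, y, z) :- R(x, y), S(y, z) are hypothetical names chosen for this example, and the relations are assumed to be duplicate-free lists of tuples. Preprocessing builds hash buckets and prefix counts in (expected) linear time, `access(i)` finds the i-th answer with one binary search, and `random_order` layers a lazy Fisher-Yates shuffle on top of `access` so that each answer is emitted exactly once in uniformly random order.

```python
import bisect
import random
from collections import defaultdict
from itertools import accumulate


class JoinIndex:
    """Random access to the answers of Q(x, y, z) :- R(x, y), S(y, z).

    Hypothetical illustration; assumes R and S contain no duplicate tuples.
    """

    def __init__(self, R, S):
        # Group S by the join attribute y.
        self.buckets = defaultdict(list)
        for y, z in S:
            self.buckets[y].append(z)
        # Keep only the R tuples that join with at least one S tuple.
        self.r = [(x, y) for x, y in R if y in self.buckets]
        # prefix[k] = number of answers produced by the first k + 1 tuples of self.r.
        self.prefix = list(accumulate(len(self.buckets[y]) for _, y in self.r))

    def __len__(self):
        return self.prefix[-1] if self.prefix else 0

    def access(self, i):
        """Return the i-th answer (0-based) in a fixed order, using one binary search."""
        if not 0 <= i < len(self):
            raise IndexError(i)
        k = bisect.bisect_right(self.prefix, i)  # R tuple that owns position i
        offset = i - (self.prefix[k - 1] if k else 0)
        x, y = self.r[k]
        return (x, y, self.buckets[y][offset])


def random_order(index, rng=random):
    """Yield every answer exactly once, in uniformly random order."""
    n = len(index)
    moved = {}  # sparse view of the array a full Fisher-Yates shuffle would keep
    for i in range(n):
        j = rng.randrange(i, n)
        pick = moved.get(j, j)      # value currently stored at position j
        moved[j] = moved.get(i, i)  # move the value at position i into position j
        moved.pop(i, None)          # position i is never read again
        yield index.access(pick)


R = [(1, "a"), (2, "b"), (3, "c")]
S = [("a", 10), ("a", 11), ("b", 20)]
index = JoinIndex(R, S)
answers_in_random_order = list(random_order(index, random.Random(0)))
```

The dictionary `moved` stores only the positions that the shuffle has touched, so no array of size equal to the number of answers is ever materialized; this matters because the answer set can be much larger than the input. The sketch covers a single join query only and does not address unions of CQs, for which the abstract states that random access may be impossible.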


Published In

PODS'20: Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
June 2020
480 pages
ISBN:9781450371087
DOI:10.1145/3375395
  • General Chair: Dan Suciu
  • Program Chair: Yufei Tao
  • Publications Chair: Zhewei Wei
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. complexity
  2. enumeration
  3. unions of conjunctive queries

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '20

Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%
