skip to main content
article

Deciding equivalences among conjunctive aggregate queries

Published: 01 April 2007 Publication History

Abstract

Equivalence of aggregate queries is investigated for the class of conjunctive queries with comparisons and the aggregate operators count, count-distinct, min, max, and sum. Essentially, this class contains unnested SQL queries with the above aggregate operators, with a where clause consisting of a conjunction of comparisons, and without a having clause. The comparisons are either interpreted over a domain with a dense order (like the rationals) or with a discrete order (like the integers). Characterizations of equivalence differ for the two cases. For queries with either max or min, equivalence is characterized in terms of dominance mappings, which can be viewed as a generalization of containment mappings. For queries with the count-distinct operator, a sufficient condition for equivalence is given in terms of equivalence of conjunctive queries under set semantics. For some special cases, it is shown that this condition is also necessary. For conjunctive queries with comparisons but without aggregation, equivalence under bag-set semantics is characterized in terms of isomorphism. This characterization essentially remains the same also for queries with the count operator. Moreover, this characterization also applies to queries with the sum operator if the queries have either constants or comparisons, but not both. In the general case (i.e., both comparisons and constants), the characterization of the equivalence of queries with the sum operator is more elaborate. All the characterizations given in the paper are decidable in polynomial space.

References

[1]
Aho, A., Sagiv, Y., and Ullman, J. 1979. Efficient optimization of a class of relational expressions. ACM Trans. Datab. Syst. 4, 4, 435--454.
[2]
Albert, J. 1991. Algebraic properties of bag data types. In Proceedings of the 17th International Conference on Very Large Data Bases (Barcelona, Spain). Morgan-Kaufmann, San Francisco, CA, 211--219.
[3]
Benedikt, M., and Keisler, H. 1997. Expressive power of unary counters. In Proceedings of the 6th International Conference on Database Theory (Delphi, Greece). F. Afrati and P. Kolaitis, Eds. Lecture Notes in Computer Science, vol. 1186. Springer-Verlag, New York, 291--305.
[4]
Benedikt, M., and Libkin, L. 1999. Exact and approximate aggregation in constraint query languages. In Proceedings of the 18th Symposium on Principles of Database Systems (Philadelphia, PA). C. Papadimitriou, Ed. ACM, New York, 102--113.
[5]
Calvanese, D., De Giacomo, G., Lenzerini, M., and Vardi, M. Y. 2000. Containment of conjunctive regular path queries with inverse. In Proceedings of the 7th International Conference on Principles of Knowledge Representation and Reasoning (Breckenridge, CO). A. G. Cohn, F. Giunchiglia, and B. Selman, Eds. Morgan-Kaufmann, San Francisco, CA.
[6]
Calvanese, D., De Giacomo, G., Lenzerini, M., and Vardi, M. Y. 2001. View-based query answering and query containment over semistructured data. In Proceedings of the 8th International Workshop on Database Programming Languages (Frascati, Italy). Springer-Verlag, New York, 40--61.
[7]
Calvanese, D., De Giacomo, G., and Vardi, M. Y. 2003. Decidable containment of recursive queries. In Proceedings of the 9th International Conference on Database Theory (Siena, Italy). Lecture Notes in Computer Science. Springer-Verlag, New York, 330--345.
[8]
Chandra, A., and Merlin, P. 1977. Optimal implementation of conjunctive queries in relational databases. In Proceedings of the 9th Annual ACM Symposium on Theory of Computing (Boulder, CO). ACM, New York, 77--90.
[9]
Chaudhuri, S., and Dayal, U. 1997. An overview of data warehousing and OLAP technology. SIGMOD Record 26, 1, 65--74.
[10]
Chaudhuri, S., Krishnamurthy, S., Potarnianos, S., and Shim, K. 1995. Optimizing queries with materialized views. In Proceedings of the 11th International Conference on Data Engineering (Taipei, China). P. S. Yu and A. Chen, Eds. IEEE Computer Society Press, Los Alamitos, CA.
[11]
Chaudhuri, S., and Vardi, M. 1993. Optimization of real conjunctive queries. In Proceedings of the 12th Symposium on Principles of Database Systems (Washington, DC). ACM, New York.
[12]
Chekuri, C., and Rajaraman, A. 2000. Conjunctive query containment revisited. Theoret. Comput. Sci. 239, 2, 211--229.
[13]
Cohen, S., Nutt, W., and Sagiv, Y. 2001. Equivalences among aggregate queries with negation. In Proceedings of the 20th Symposium on Principles of Database Systems (Santa Barbara, CA). ACM, New York, 155--166.
[14]
Cohen, S., Nutt, W., and Sagiv, Y. 2003. Containment of aggregate queries. In Proceedings of the 9th International Conference on Database Theory (Siena, Italy). Lecture Notes in Computer Science. Springer-Verlag, New york.
[15]
Cohen, S., Nutt, W., and Sagiv, Y. 2005. Equivalences among aggregate queries with negation. ACM Trans. Computat. Logic 6, 2, 328--360.
[16]
Cohen, S., Nutt, W., and Serebrenik, A. 1999. Rewriting aggregate queries using views. In Proceedings of the 18th Symposium on Principles of Database Systems (Philadelphia, PA). C. Papadimitriou, Ed. ACM, New york.
[17]
Dobra, A., Garofalakis, M. N., Gehrke, J., and Rastogi, R. 2002. Processing complex aggregate queries over data streams. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data (Madison, WI). ACM, New york, 61--72.
[18]
Etessami, K., and Immerman, N. 2000. Tree canonization and transitive closure. Inf. Comput. 157, 1-2, 2--24.
[19]
Grumbach, S., Rafanelli, M., and Tininini, L. 1999. Querying aggregate data. In Proceedings of the 18th Symposium on Principles of Database Systems (Philadelphia, PA). C. Papadimitriou, Ed. ACM, New York, 174--183.
[20]
Guo, S., Sun, W., and Weiss, M. 1996a. On satisfiability, equivalence, and implication problems involving conjunctive queries in database systems. IEEE Trans. Knowl. Data Eng. 8, 4, 604--616.
[21]
Guo, S., Sun, W., and Weiss, M. 1996b. Solving satisfiability and implication problems in database systems. ACM Trans. Datab. Syst. 21, 2, 270--293.
[22]
Gupta, A., Harinarayan, V., and Quass, D. 1995. Aggregate query processing in data warehouses. In Proceedings of the 21st International Conference on Very Large Data Bases (Zurich, Switzerland). Morgan-Kaufmann, San Francisco, CA.
[23]
Gupta, A., and Mumick, I. S. 1999. Materialized Views: Techniques, Implementations, and Applications. MIT Press, Cambridge, MA.
[24]
Hella, L., Libkin, L., Nurmonen, J., and Wong, L. 1999. Logics with aggregate operators. In Proceedings of the 14th IEEE Symposium on Logic in Computer Science (Trento, Italy). IEEE Computer Society Press, Los Alamitos, CA, 35--44.
[25]
Ioannidis, Y., and Ramakrishnan, R. 1995. Beyond relations as sets. ACM Trans. Datab. Syst. 20, 3, 288--324.
[26]
Johnson, D., and Klug, A. 1983. Optimizing conjunctive queries that contain untyped variables. SIAM J. Comput. 12, 4, 616--640.
[27]
Klug, A. 1982. Equivalence of relational algebra and relational calculus query languages having aggregate functions. J. ACM 29, 3, 699--717.
[28]
Klug, A. 1988. On conjunctive queries containing inequalities. J. ACM 35, 1, 146--160.
[29]
Levy, A., Mumick, I. S., Sagiv, Y., and Shmueli, O. 1993. Equivalence, query-reachability, and satisfiability in datalog extensions. In Proceedings of the 12th Symposium on Principles of Database Systems (Washington, DC). ACM, New York, 109--122.
[30]
Levy, A., Mumick, I. S., Sagiv, Y., and Shmueli, O. 2001. Static analysis in datalog extensions. J. ACM 48, 5, 971--1012.
[31]
Levy, A., and Sagiv, Y. 1995. Semantic query optimization in datalog programs. In Proceedings of the 14th Symposium on Principles of Database Systems (San Jose, CA). ACM, New York, 163--173.
[32]
Madden, S., Szewczyk, R., Franklin, M. J., and Culler, D. 2002. Supporting aggregate queries over ad-hoc wireless sensor networks. In Proceedings of the 4th IEEE Workshop on Mobile Computing Systems and Applications (Callicoon, NY). IEEE Computer Society Press, Los Alamitos, CA, 49--58.
[33]
Miklau, G., and Suciu, D. 2002. Containment and equivalence for an XPath fragment. In Proceedings of the 21st Symposium on Principles of Database Systems (Madison, WI). ACM, New York, 65--76.
[34]
Nutt, W., Sagiv, Y., and Shurin, S. 1998. Deciding equivalences among aggregate queries. In Proceedings of the 17th Symposium on Principles of Database Systems (Seattle, WA). J. Paredaens, Ed. ACM, New York, 214--223. Long version as Report of Esprit LTR DWQ.
[35]
Özsoyoğlu, G., Özsoyoğlu, Z., and Matos, V. 1987. Extending relational algebra and relational calculus with set-valued attributes and aggregate functions. ACM Trans. Datab. Syst. 12, 4, 566--592.
[36]
Popa, L., and Tannen, V. 1999. An equational chase for path-conjunctive queries, constraints, and views. In Proceedings of the 7th International Conference on Database Theory (Jerusalem, Israel). C. Beeri and P. Buneman, Eds. Lecture Notes in Computer Science, vol. 1540. Springer-Verlag, New York, 39--57.
[37]
Ross, K., Srivastava, D., Stuckey, P., and Sudarshan, S. 1998. Foundations of aggregation constraints. Theoret. Comput. Sci. 193, 1-2, 149--179.
[38]
Sagiv, Y., and Saraiya, Y. 1992. Minimizing restricted-fanout queries. Disc. Appl. Math. 40, 245--264.
[39]
Sagiv, Y., and Yannakakis, M. 1981. Equivalence among relational expressions with the union and difference operators. J. ACM 27, 4, 633--655.
[40]
Srivastava, D., Dar, S., Jagadish, H., and Levy, A. 1996. Answering queries with aggregation using views. In Proceedings of the 22nd International Conference on Very Large Data Bases (Bombay, India). Morgan-Kaufmann, San Francisco, CA.
[41]
van der Meyden, R. 1992. The complexity of querying indefinite data about linearly ordered domains. In Proceedings of the 11th Symposium on Principles of Database Systems (San Diego, CA). ACM, New York, 331--345.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Journal of the ACM
Journal of the ACM  Volume 54, Issue 2
April 2007
188 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/1219092
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 April 2007
Published in JACM Volume 54, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Aggregation
  2. Datalog
  3. bag-set semantics
  4. query equivalence

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)53
  • Downloads (Last 6 weeks)3
Reflects downloads up to 12 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media