skip to main content
10.1145/2791347.2791382acmotherconferencesArticle/Chapter ViewAbstractPublication PagesssdbmConference Proceedingsconference-collections
research-article

Relaxation of subgraph queries delivering empty results

Published: 29 June 2015 Publication History

Abstract

Graph databases with the property graph model are used in multiple domains including social networks, biology, and data integration. They provide schema-flexible storage for data of a different degree of a structure and support complex, expressive queries such as subgraph isomorphism queries. The exibility and expressiveness of graph databases make it difficult for the users to express queries correctly and can lead to unexpected query results, e.g. empty results. Therefore, we propose a relaxation approach for subgraph isomorphism queries that is able to automatically rewrite a graph query, such that the rewritten query is similar to the original query and returns a non-empty result set. In detail, we present relaxation operations applicable to a query, cardinality estimation heuristics, and strategies for prioritizing graph query elements to be relaxed. To determine the similarity between the original query and its relaxed variants, we propose a novel cardinality-based graph edit distance. The feasibility of our approach is shown by using real-world queries from the DBpedia query log.

References

[1]
N. Bidoit, M. Herschel, and K. Tzompanaki. Query-based Why-Not provenance with NedExplain. In Proc. EDBT, pages 145--156, 2014.
[2]
C. Bornhövd, R. Kubis, W. Lehner, H. Voigt, and H. Werner. Flexible information management, exploration and analysis in SAP HANA. In DATA, pages 15--28, 2012.
[3]
A. Chapman and H. V. Jagadish. Why Not? In Proc. SIGMOD, pages 523--534. ACM, 2009.
[4]
S. Chaudhuri. Generalization and a framework for query modification. In Proc. ICDE, pages 138--145. IEEE Computer Society, 1990.
[5]
X. Gao, B. Xiao, D. Tao, and X. Li. A survey of graph edit distance. Pattern Anal. Appl., pages 113--129, 2010.
[6]
M. Herschel. Wondering why data are missing from query results?: Ask conseil why-not. In Proc. CIKM, pages 2213--2218. ACM, 2013.
[7]
M. Herschel and M. A. Hernández. Explaining missing answers to SPJUA queries. Proc. VLDB Endow., pages 185--196, 2010.
[8]
M. Herschel, M. A. Hernández, and W.-C. Tan. Artemis: A system for analyzing missing answers. Proc. VLDB Endow., pages 1550--1553, 2009.
[9]
J. Huang, T. Chen, A. Doan, and J. F. Naughton. On the provenance of non-answers to queries over extracted data. Proc. VLDB Endow., pages 736--747, 2008.
[10]
D. Jannach. Techniques for fast query relaxation in content-based recommender systems. In Proc. KI, pages 49--63. Springer-Verlag, 2007.
[11]
U. Junker. QUICKXPLAIN: preferred explanations and relaxations for over-constrained problems. In AAAI, pages 167--172, 2004.
[12]
F. Mandreoli, R. Martoglia, G. Villani, and W. Penzo. Flexible query answering on graph-modeled data. In Proc. EDBT, pages 216--227. ACM, 2009.
[13]
D. Mottin, A. Marascu, S. Basu Roy, G. Das, T. Palpanas, and Y. Velegrakis. IQR: An interactive query relaxation system for the empty-answer problem. In Proc. SIGMOD, pages 1095--1098. ACM, 2014.
[14]
D. Mottin, A. Marascu, S. B. Roy, G. Das, T. Palpanas, and Y. Velegrakis. A probabilistic optimization framework for the empty-answer problem. Proc. VLDB Endow., pages 1762--1773, 2013.
[15]
R. Myers, R. Wison, and E. R. Hancock. Bayesian graph edit distance. IEEE Trans. Pattern Anal. Mach. Intell., pages 628--635, 2000.
[16]
M. Neuhaus and H. Bunke. A probabilistic approach to learning costs for graph edit distance. In Proc. ICPR, pages 389--393. IEEE, 2004.
[17]
M. Neuhaus and H. Bunke. Self-organizing maps for learning the edit costs in graph matching. Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, pages 503--514, 2005.
[18]
M. Paradies, M. Rudolf, C. Bornhövd, and W. Lehner. GRATIN: Accelerating graph traversals in main-memory column stores. In GRADES, pages 9:1--9:6. ACM, 2014.
[19]
M. A. Rodriguez and P. Neubauer. Constructions from dots and lines. Bulletin of the American Society for Inf. Science and Technology, pages 35--41, 2010.
[20]
M. Rudolf, M. Paradies, C. Bornhövd, and W. Lehner. The graph story of the SAP HANA database. In Datenbanksysteme für Business, Technologie und Web (BTW). Proceedings, pages 403--420, 2013.
[21]
T. Tran, H. Wang, S. Rudolph, and P. Cimiano. Top-k exploration of query candidates for efficient keyword search on graph-shaped (RDF) data. In Proc. ICDE, pages 405--416, 2009.
[22]
E. Vasilyeva, M. Thiele, C. Bornhövd, and W. Lehner. GraphMCS: Discover the unknown in large data graphs. In Workshops EDBT, pages 200--207, 2014.
[23]
E. Vasilyeva, M. Thiele, C. Bornhovd, and W. Lehner. Top-k differential queries in graph databases. In Advances in Databases and Information Systems, Lecture Notes in Computer Science, pages 112--125. Springer International Publishing, 2014.
[24]
J. Wei. Markov edit distance. IEEE Trans. Pattern Anal. Mach. Intell., pages 311--321, 2003.
[25]
R. C. Wilson and E. R. Hancock. Structural matching by discrete relaxation. IEEE Trans. Pattern Anal. Mach. Intell., pages 634--648, 1997.
[26]
H. Zhou, A. A. Shaverdian, H. Jagadish, and G. Michailidis. Querying graphs with uncertain predicates. In Proc. Eighth Workshop on Mining and Learning with Graphs, pages 163--170. ACM, 2010.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
SSDBM '15: Proceedings of the 27th International Conference on Scientific and Statistical Database Management
June 2015
390 pages
ISBN:9781450337090
DOI:10.1145/2791347
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 June 2015

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. "why empty?"-query
  2. empty-answer problem
  3. graph database
  4. query relaxation

Qualifiers

  • Research-article

Funding Sources

  • FP7 EU project Linked-Design

Conference

SSDBM 2015

Acceptance Rates

Overall Acceptance Rate 56 of 146 submissions, 38%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)4
  • Downloads (Last 6 weeks)0
Reflects downloads up to 01 Feb 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media