skip to main content
10.1145/1514894.1514916acmotherconferencesArticle/Chapter ViewAbstractPublication PagesedbtConference Proceedingsconference-collections
research-article
Free access

Efficient asymmetric inclusion between regular expression types

Published: 23 March 2009 Publication History

Abstract

The inclusion of Regular Expressions (REs) is the kernel of any subtype checking algorithm for XML schema languages. XML applications would benefit from the extension of REs with interleaving and counting, but this is not feasible in general, since inclusion is EXPSPACE-complete for such extended REs. In [9] we introduced a notion of "conflict-free REs", which are extended REs with excellent complexity behaviour, including a cubic inclusion algorithm [9] and linear membership [10]. Conflict-free REs have interleaving and counting, but the complexity is tamed by the "conflict-free" limitations, which have been found to be satisfied by the vast majority of the content models published on the Web.
However, the most important use of subtype checking is in the context of type-cheching of XML manipulation languges. A type checker works by testing the inclusion of inferred subtypes in declared supertypes. The conflict-free restriction, while quite harmless for the human-defined supertype, is far too restrictive for the inferred subtype, whose shape is difficult to constrain.
We show here that the PTIME inclusion algorithm can be actually extended to deal with totally unrestricted REs with counting and interleaving in the subtype position, provided that the supertype is conflict-free. This is exactly the expressive power that we need in order to use subtyping inside type-checking algorithms, and the cost of this generalized algorithm is only quadratic, which is as good as the best algorithm we have for the symmetric case (see [5]). The result is extremely surprising, since we had previously found that asymmetric inclusion becomes NP-hard as soon as the candidate subtype is enriched with binary intersection, a generalization that looked much more innocent than what we achieve here.

References

[1]
D. Barbosa, G. Leighton, and A. Smith. Efficient incremental validation of XML documents after composite updates. In XSym, volume 4156 of LNCS, pages 107--121. Springer, 2006.
[2]
M. A. Bender and M. Farach-Colton. The LCA problem revisited. In G. H. Gonnet, D. Panario, and A. Viola, editors, LATIN, volume 1776 of Lecture Notes in Computer Science, pages 88--94. Springer, 2000.
[3]
G. J. Bex, F. Neven, T. Schwentick, and K. Tuyls. Inference of concise dtds from xml data. In U. Dayal, K.-Y. Whang, D. B. Lomet, G. Alonso, G. M. Lohman, M. L. Kersten, S. K. Cha, and Y.-K. Kim, editors, Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Korea, September 12--15, 2006, pages 115--126. ACM, 2006.
[4]
B. Choi. What are real DTDs like? In WebDB, pages 43--48, 2002.
[5]
D. Colazzo, G. Ghelli, and C. Sartiani. Efficient inclusion for a class of xml types with interleaving and counting. Information Systems, 2008. To Appear.
[6]
D. Colazzo and C. Sartiani. Efficient subtyping for unordered XML types. Technical report, Dipartimento di Informatica - Università di Pisa, 2007.
[7]
J. N. Foster, B. C. Pierce, and A. Schmitt. A logic your typechecker can count on: Unordered tree types in practice. In Workshop on Programming Language Technologies for XML (PLAN-X), informal proceedings, Jan. 2007.
[8]
W. Gelade, W. Martens, and F. Neven. Optimizing schema languages for xml: Numerical constraints and interleaving. In T. Schwentick and D. Suciu, editors, Proceedings of the 11th International Conference on Database Theory - ICDT 2007, Barcelona, Spain, January 10--12, 2007, volume 4353 of Lecture Notes in Computer Science, pages 269--283. Springer, 2007.
[9]
G. Ghelli, D. Colazzo, and C. Sartiani. Efficient inclusion for a class of XML types with interleaving and counting. In M. Arenas and M. I. Schwartzbach, editors, Proceedings of the 11th International Symposium on Database Programming Languages, DBPL 2007, Vienna, Austria, September 23--24, 2007, Revised Selected Papers, volume 4797 of Lecture Notes in Computer Science, pages 231--245. Springer, 2007.
[10]
G. Ghelli, D. Colazzo, and C. Sartiani. Linear time membership for a class of XML types with interleaving and counting. In ACM Conference on Information and Knowledge Management (CIKM), 2008.
[11]
W. Martens, F. Neven, T. Schwentick, and G. J. Bex. Expressiveness and complexity of xml schema. ACM Trans. Database Syst., 31(3):770--813, 2006.
[12]
A. J. Mayer and L. J. Stockmeyer. Word problems -- this time with interleaving. Inf. Comput., 115(2):293--311, 1994.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
ICDT '09: Proceedings of the 12th International Conference on Database Theory
March 2009
334 pages
ISBN:9781605584232
DOI:10.1145/1514894
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 March 2009

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. XML
  2. language inclusion
  3. regular expressions

Qualifiers

  • Research-article

Conference

EDBT/ICDT '09
EDBT/ICDT '09: EDBT/ICDT '09 joint conference
March 23 - 25, 2009
St. Petersburg, Russia

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)67
  • Downloads (Last 6 weeks)10
Reflects downloads up to 04 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media