DOI: 10.5555/1316689.1316698
Article

Schema-free XQuery

Published: 31 August 2004

Abstract

The widespread adoption of XML holds out the promise that document structure can be exploited to specify precise database queries. However, the user may have only a limited knowledge of the XML structure, and hence may be unable to produce a correct XQuery, especially in the context of a heterogeneous information collection. The default is to use keyword-based search and we are all too familiar with how difficult it is to obtain precise answers by these means. We seek to address these problems by introducing the notion of Meaningful Lowest Common Ancestor Structure (MLCAS) for finding related nodes within an XML document. By automatically computing MLCAS and expanding ambiguous tag names, we add new functionality to XQuery and enable users to take full advantage of XQuery in querying XML data precisely and efficiently without requiring (perfect) knowledge of the document structure. Such a Schema-Free XQuery is potentially of value not just to casual users with partial knowledge of schema, but also to experts working in a data integration or data evolution context. In such a context, a schema-free query, once written, can be applied universally to multiple data sources that supply similar content under different schemas, and applied "forever" as these schemas evolve. Our experimental evaluation found that it was possible to express a wide variety of queries in a schema-free manner and have them return correct results over a broad diversity of schemas. Furthermore, the evaluation of a schema-free query is not expensive using a novel stack-based algorithm we develop for computing MLCAS: from 1 to 4 times the execution time of an equivalent schema-aware query.
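
The abstract names the Meaningful Lowest Common Ancestor Structure but does not define it. As a minimal sketch of the underlying building block only, the Python below computes the plain lowest common ancestor of two nodes in a small XML tree. The sample document, the helper names, and the parent-map approach are all assumptions made for illustration; this is not the paper's MLCAS definition or its stack-based algorithm.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<bib>
  <book>
    <title>Title A</title>
    <author>Author A</author>
  </book>
  <article>
    <title>Title B</title>
    <author>Author B</author>
  </article>
</bib>
"""


def ancestors_or_self(node, parent_of):
    """Return the path from node up to the root, starting with node."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


def lowest_common_ancestor(a, b, parent_of):
    """Return the deepest element that is an ancestor-or-self of both a and b."""
    seen = {id(n) for n in ancestors_or_self(a, parent_of)}
    for n in ancestors_or_self(b, parent_of):
        if id(n) in seen:
            return n
    return None


root = ET.fromstring(SAMPLE)
# ElementTree keeps no parent pointers, so build a child -> parent map once.
parent_of = {child: parent for parent in root.iter() for child in parent}

titles = root.findall(".//title")
authors = root.findall(".//author")

same_entry = lowest_common_ancestor(titles[0], authors[0], parent_of)
cross_entry = lowest_common_ancestor(titles[0], authors[1], parent_of)
print(same_entry.tag, cross_entry.tag)
```

In this sample, a title and author from the same entry meet at `book`, while a title and author from different entries meet only at the root `bib`. Preferring the lower meeting point is one plausible reading of why the abstract qualifies the ancestor as "meaningful", though the abstract itself does not state the criterion.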

Published In

VLDB '04: Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
August 2004
1380 pages

Sponsors

  • VLDB Endowment: Very Large Database Endowment

Publisher

VLDB Endowment

Qualifiers

  • Article
