Article (DOI: 10.1145/1142473.1142525)

Meta-data indexing for XPath location steps

Published: 27 June 2006

Abstract

XML is the de facto standard for data representation and exchange over the Web. Given the diversity of the information available in XML, it is useful to annotate XML data with a wide variety of meta-data, such as quality and sensitivity. When querying such XML data, say using XPath, it is important to efficiently identify the data that meet specified constraints on the meta-data. For example, different users may be satisfied with different levels of quality guarantees, or may only have access to different parts of the XML data based on specified security policies. In this paper, we address the problem of efficiently identifying the XML elements along a location step in an XPath query that satisfy meta-data range constraints, when the meta-data levels are drawn from an ordered domain (e.g., accuracy in [0,1], recency using timestamps, or multi-level security). More specifically, we develop a family of index structures, which we refer to as meta-data indexes, to address this problem. A meta-data index is easily instantiated using a multi-dimensional index structure, such as an R-tree, incorporating novel query and update algorithms. We show that the full meta-data index (FMI), based on associating each XML element with its meta-data level, has a very high update cost for modifying an element's meta-data level. We resolve this problem by designing the inheritance meta-data index (IMI), in which (i) actual meta-data levels are associated only with elements for which this value is explicitly specified, and (ii) inherited meta-data levels and inheritance source nodes are associated with non-leaf nodes of the index structure. We design efficient query (for all XPath axes) and update (of meta-data levels) algorithms for the IMI, and experimentally demonstrate the superiority of the IMI over the FMI using benchmark data sets.
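The trade-off between storing a level for every element and storing only the explicitly specified levels can be illustrated with a toy sketch. It is not the paper's R-tree based index or its algorithms. It assumes, for illustration only, that levels live in a hypothetical `level` attribute, that an element without an explicit level inherits from its nearest ancestor that has one, and that the root always carries an explicit level. All function names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "level" is an assumed attribute holding an
# accuracy value in [0, 1]; elements without it inherit from an ancestor.
DOC = """
<db level="0.9">
  <book>
    <title>A</title>
    <price level="0.4">10</price>
  </book>
  <book level="0.6">
    <title>B</title>
  </book>
</db>
"""

def parent_map(root):
    """ElementTree keeps no parent pointers, so build a child -> parent map."""
    return {child: parent for parent in root.iter() for child in parent}

def effective_level(elem, parents):
    """Walk up to the nearest element with an explicit level (assumed rule)."""
    node = elem
    while node is not None:
        if "level" in node.attrib:
            return float(node.attrib["level"])
        node = parents.get(node)
    raise ValueError("no explicit level on the element or its ancestors")

def child_step(context, parents, lo, hi):
    """Child-axis location step filtered by a meta-data range constraint."""
    return [c for c in context if lo <= effective_level(c, parents) <= hi]

def entries_touched_if_materialized(target):
    """Count the per-element entries that would change if every element
    stored its own level and target's explicit level were modified:
    target plus every descendant that inherits from it."""
    count, stack = 0, [target]
    while stack:
        node = stack.pop()
        count += 1
        # Descendants with their own explicit level are unaffected.
        stack.extend(c for c in node if "level" not in c.attrib)
    return count

root = ET.fromstring(DOC)
parents = parent_map(root)

# Children of <db> whose effective accuracy lies in [0.8, 1.0].
books = child_step(root, parents, 0.8, 1.0)
print([b.find("title").text for b in books])  # ['A']

# Changing the level on <db>: 3 materialized entries change,
# versus a single explicit entry when only explicit levels are stored.
print(entries_touched_if_materialized(root))  # 3
```

In this sketch, storing only explicit levels makes an update cheap but pushes work to query time (the walk up the ancestor chain). The abstract indicates that the IMI addresses this by keeping inherited levels and inheritance source nodes at non-leaf nodes of the index structure, which this toy version does not attempt to model.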


Published In

SIGMOD '06: Proceedings of the 2006 ACM SIGMOD international conference on Management of data
June 2006
830 pages
ISBN:1595934340
DOI:10.1145/1142473
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. XML
  2. hierarchical inheritance
  3. meta-data index

Qualifiers

  • Article

Conference

SIGMOD/PODS06
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
