DOI: 10.5555/2889875.2889881
Article

Using web data provenance for quality assessment

Published: 25 October 2009

Abstract

The Web of Data cannot be a trustworthy data source unless an approach for evaluating the quality of data on the Web is established and integrated as part of the data publication and access process. In this paper, we propose an approach that uses provenance information about data on the Web to assess its quality and trustworthiness. Our contributions include a model for Web data provenance and an assessment method that can be adapted for specific quality criteria. We demonstrate how this method can be used to evaluate the timeliness of data on the Web, that is, how up-to-date the data is. We also propose a possible solution for dealing with missing provenance information by associating certainty values with calculated quality values.



Published In

SWPM'09: Proceedings of the First International Conference on Semantic Web in Provenance Management - Volume 526
October 2009
47 pages
Editors: Juliana Freire, Paolo Missier, Satya S. Sahoo

Publisher

CEUR-WS.org

Aachen, Germany
