DOI: 10.1145/2675743.2772587
research-article

Top-K queries in RDF graph-based stream processing with actors

Published: 24 June 2015

Abstract

In this paper, we describe RGraSPA (RDF Graph-based Stream Processing with Actors), a novel system that builds on RDF graphs and knowledge reasoning and uses an actor model to distribute continuous queries. We also present our approach to solving the DEBS Grand Challenge with this system. RGraSPA uses an RDF graph-based event model to encapsulate a set of triples and process them in a continuous manner. We further present our synchronised structure traversal algorithm, which uses a Range tree to store results in a sorted view, where each node of the tree maintains a balanced Multimap Binary Search Tree (BST). The range of each node is adaptive: it is updated according to the incoming values and the defined size of the Multimap BST for each node.
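To make the idea of a sorted, range-partitioned result view concrete, the following is a minimal sketch and not the authors' synchronised structure traversal algorithm. It assumes each range holds a small sorted list rather than a balanced Multimap BST, and that a range splits in two when it exceeds a fixed capacity, so range boundaries adapt to incoming values. The class name `RangeBucketedTopK`, its methods, and the `bucket_capacity` parameter are hypothetical.

```python
from bisect import bisect_right, insort


class RangeBucketedTopK:
    """Keeps (score, key) pairs sorted, partitioned into bounded buckets.

    Illustrative only: sorted lists stand in for the per-node Multimap BSTs.
    """

    def __init__(self, bucket_capacity=4):
        self.capacity = bucket_capacity
        self.buckets = [[]]   # ordered buckets, each a sorted list of (score, key)
        self.scores = {}      # key -> current score

    def _bucket_index(self, item):
        # The lower bound of bucket i (i >= 1) is its first item; an item
        # belongs to the last bucket whose lower bound is <= the item.
        lows = [b[0] for b in self.buckets[1:]]
        return bisect_right(lows, item)

    def _insert(self, item):
        i = self._bucket_index(item)
        bucket = self.buckets[i]
        insort(bucket, item)
        if len(bucket) > self.capacity:
            # Overflow: split the range in two so the boundaries adapt.
            mid = len(bucket) // 2
            self.buckets[i:i + 1] = [bucket[:mid], bucket[mid:]]

    def _remove(self, item):
        i = self._bucket_index(item)
        self.buckets[i].remove(item)
        if not self.buckets[i] and len(self.buckets) > 1:
            del self.buckets[i]

    def update(self, key, score):
        """Set the score of key, moving it to its new sorted position."""
        if key in self.scores:
            self._remove((self.scores[key], key))
        self.scores[key] = score
        self._insert((score, key))

    def top_k(self, k):
        """Return up to k (score, key) pairs, highest score first."""
        out = []
        for bucket in reversed(self.buckets):
            for item in reversed(bucket):
                out.append(item)
                if len(out) == k:
                    return out
        return out
```

Several keys may share one score, which is the multimap aspect. A top-k read only touches the highest ranges, so it does not need to scan the full result set.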
To solve the DEBS challenge, we provide a formal method to calculate cell IDs from longitude and latitude in a streaming fashion, and we use two Range trees to track the 10 most frequent routes and the most profitable areas. Our experimental results show that query execution time can be optimised by carefully adjusting the cardinality values of the Range tree. Our solution processes one year's worth of RDFised taxi data (372 GB, approximately 3.4 billion triples) in 1.8 hours.
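The abstract does not give the formal cell-ID method, so the sketch below only illustrates the general idea of mapping one coordinate pair to a grid cell per event. It assumes a flat-earth (equirectangular) approximation, a grid whose cell (1, 1) is centred on a given origin, columns that grow eastward and rows that grow southward. The function name, all parameters, and the metres-per-degree constant are assumptions of this example, not values taken from the paper.

```python
import math

# Rough average length of one degree of latitude in metres (assumption).
METERS_PER_DEG_LAT = 111_320.0


def cell_id(lon, lat, origin_lon, origin_lat, cell_size_m, grid_size):
    """Map a coordinate to a 1-based (col, row) grid cell, or None if outside.

    (origin_lon, origin_lat) is taken to be the centre of cell (1, 1).
    """
    # Offsets from the origin in metres; longitude shrinks with cos(latitude).
    east_m = (lon - origin_lon) * METERS_PER_DEG_LAT * math.cos(
        math.radians(origin_lat))
    south_m = (origin_lat - lat) * METERS_PER_DEG_LAT

    # Adding 0.5 accounts for the origin being a cell centre, not a corner.
    col = math.floor(east_m / cell_size_m + 0.5) + 1
    row = math.floor(south_m / cell_size_m + 0.5) + 1

    if 1 <= col <= grid_size and 1 <= row <= grid_size:
        return col, row
    return None
```

Because the function is stateless, it can be applied to each incoming event independently, which suits a streaming setting. A production version would likely use a proper map projection instead of the flat-earth approximation.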


Published In

DEBS '15: Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems
June 2015
385 pages
ISBN:9781450332866
DOI:10.1145/2675743

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. RDF graph streaming
  2. range trees
  3. top-K continuous queries

Conference

DEBS '15

Acceptance Rates

Overall Acceptance Rate 145 of 583 submissions, 25%
