skip to main content
10.1145/1353343.1353410acmotherconferencesArticle/Chapter ViewAbstractPublication PagesedbtConference Proceedingsconference-collections
research-article
Free access

Querying time-series streams

Published: 25 March 2008 Publication History

Abstract

Index trees created using distance based indexing are difficult to maintain online since the distance function involved is often costly to compute. This problem is intensified when the database we are dealing with, is frequently updated, as only limited time is available to perform the maintenance. In this paper, we propose a novel tree maintenance mechanism for the problem of answering approximate k-Nearest Neighbor queries with a probabilistic guarantee on timeseries streams. When the underlying data change, we may choose to defer updating the tree as long as the probabilistic guarantee of answering queries is high. To prolong such deferment, we present innovative techniques that maintain the utility of the tree by migrating its pivots and by partially reconstructing it. As the probabilistic guarantee decays with time and crosses the minimum guarantee threshold, all of the deferred updates are performed. In essence, our work offers an elegant compromise between the accuracy guarantee of query results and the cost of providing them. With extensive empirical studies, we also show the flexibility and efficiency of our approach.

References

[1]
B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and issues in data stream systems. In PODS, pages 1--16, 2002.
[2]
B. Babcock, M. Datar, R. Motwani, and L. O'Callaghan. Maintaining variance and k-medians over data stream windows. In PODS, pages 234--243, 2003.
[3]
T. Bozkaya and Z. M. Özsoyoglu. Indexing large metric spaces for similarity search queries. ACM Trans. Database Syst., 24(3):361--404, 1999.
[4]
S. Brin. Near neighbor search in large metric spaces. In VLDB, pages 574--584, 1995.
[5]
B. Bustos, G. Navarro, and E. Chávez. Pivot selection techniques for proximity searching in metric spaces. Pattern Recognition Letters, 24(14):2357--2366, 2003.
[6]
Y. Cai, K. A. Hua, G. Cao, and T. Xu. Real-time processing of range-monitoring queries in heterogeneous mobile databases. IEEE Trans. Mob. Comput., 5(7):931--942, 2006.
[7]
L. Chen and R. T. Ng. On the marriage of Lp-norms and edit distance. In VLDB, pages 792--803, 2004.
[8]
R. Cheng, D. V. Kalashnikov, and S. Prabhakar. Querying imprecise data in moving object environments. IEEE Trans. Knowl. Data Eng., 16(9):1112--1127, 2004.
[9]
P. Ciaccia, M. Patella, and P. Zezula. M-tree: An efficient access method for similarity search in metric spaces. In VLDB, pages 426--435, 1997.
[10]
B. Cui, B. C. Ooi, J. Su, and K.-L. Tan. Contorting high dimensional data for efficient main memory processing. In SIGMOD Conference, pages 479--490, 2003.
[11]
Y. Diao, M. Altinel, M. J. Franklin, H. Zhang, and P. M. Fischer. Path sharing and predicate evaluation for high-performance XML filtering. ACM Trans. Database Syst., 28(4):467--516, 2003.
[12]
A. W.-C. Fu, P. M.-S. Chan, Y.-L. Cheung, and Y. S. Moon. Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances. VLDB J., 9(2):154--173, 2000.
[13]
L. Gao and X. S. Wang. Continually evaluating similarity-based pattern queries on a streaming time series. In SIGMOD Conference, pages 370--381, 2002.
[14]
V. Gopalkrishnan, P. Chairunnanda, and A. Najib. Efficient index maintenance for answering approximate queries in a volatile environment. Technical Report TR/VG/07/03, CAIS, Nanyang Technological University, Singapore, 2007.
[15]
G. R. Hjaltason and H. Samet. Index-driven similarity search in metric spaces. ACM Trans. Database Syst., 28(4):517--580, 2003.
[16]
H. V. Jagadish, B. C. Ooi, K.-L. Tan, C. Yu, and R. Zhang. iDistance: An adaptive B+-tree based indexing method for nearest neighbor search. ACM Trans. Database Syst., 30(2):364--397, 2005.
[17]
N. Koudas, B. C. Ooi, K.-L. Tan, and R. Z. 0003. Approximate NN queries on streams with guaranteed error/performance bounds. In VLDB, pages 804--815, 2004.
[18]
Y.-N. Law and C. Zaniolo. An adaptive nearest neighbor classification algorithm for data streams. In PKDD, pages 108--120, 2005.
[19]
B. Liu, W.-C. Lee, and D. L. Lee. Distributed caching of multi-dimensional data in mobile environments. In Mobile Data Management, pages 229--233, 2005.
[20]
W. P. M. Polly and M. H. Wong. Efficient and robust feature extraction and pattern matching of time series by a lattice structure. In CIKM, pages 271--278, 2001.
[21]
S. Saltenis, C. S. Jensen, S. T. Leutenegger, and M. A. Lopez. Indexing the positions of continuously moving objects. In SIGMOD Conference, pages 331--342, 2000.
[22]
Y. Tao, R. Cheng, X. Xiao, W. K. Ngai, B. Kao, and S. Prabhakar. Indexing multi-dimensional uncertain data with arbitrary probability density functions. In VLDB, pages 922--933, 2005.
[23]
J. K. Uhlmann. Metric trees. Appl. Math. Lett., 4(5):61--62, 1991.
[24]
J. K. Uhlmann. Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett., 40(4):175--179, 1991.
[25]
M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, and E. J. Keogh. Indexing multi-dimensional time-series with support for multiple distance measures. In KDD, pages 216--225, 2003.
[26]
M. Vlachos, C. Meek, Z. Vagena, and D. Gunopulos. Identifying similarities, periodicities and bursts for online search queries. In SIGMOD Conference, pages 131--142, 2004.
[27]
L. Wei, E. J. Keogh, H. V. Herle, and A. Mafra-Neto. Atomic wedgie: Efficient query filtering for streaming times series. In ICDM, pages 490--497, 2005.
[28]
O. Wolfson, A. P. Sistla, S. Chamberlain, and Y. Yesha. Updating and querying databases that track mobile units. Distributed and Parallel Databases, 7(3):257--387, 1999.
[29]
H. Wu, B. Salzberg, and D. Zhang. Online event-driven subsequence matching over financial data streams. In SIGMOD Conference, pages 23--34, 2004.
[30]
Y. Xia and S. Prabhakar. Q+Rtree: Efficient indexing for moving object database. In DASFAA, pages 175--182, 2003.
[31]
P. N. Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In SODA, pages 311--321, 1993.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
EDBT '08: Proceedings of the 11th international conference on Extending database technology: Advances in database technology
March 2008
762 pages
ISBN:9781595939265
DOI:10.1145/1353343
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 March 2008

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Research-article

Conference

EDBT '08

Acceptance Rates

Overall Acceptance Rate 7 of 10 submissions, 70%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)32
  • Downloads (Last 6 weeks)4
Reflects downloads up to 01 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media