skip to main content
10.1145/2811411.2811470acmconferencesArticle/Chapter ViewAbstractPublication PagesracsConference Proceedingsconference-collections
research-article

Quantile estimation using the theory of stochastic learning

Published: 09 October 2015 Publication History

Abstract

The goal of our research is to estimate the quantiles of a distribution from a large set of samples that arrive sequentially. Since the data set is large, the model we choose is that the data cannot be stored, but rather that estimates of the quantiles are computed in a real-time setting. In such settings, classical estimators that require storing the whole history of the data (or stream) cannot be deployed. In this paper, we present an incremental quantile estimator of a distribution, i.e., one that utilizes the previously-computed estimates and only resorts to the last sample for updating these estimates. The state-of-the-art work on obtaining incremental quantile estimators is due to Tierney [9], and is based on the theory of stochastic approximation. However, a primary shortcoming of the latter work is the requirement to incrementally build local approximations of the distribution function in the neighborhood of the quantiles. This requirement, unfortunately, increases the complexity of the algorithm. The convergence of the estimator suggested here is proven using the theory of stochastic learning. These theoretical results have been verified experimentally, and they demonstrate that our estimator outperforms the state-of-the-art estimators. In addition, it also copes with dynamic environments.

References

[1]
A. Arasu and G. S. Manku. Approximate counts and quantiles over sliding windows. In Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 286--296. ACM, 2004.
[2]
J. Cao, L. Li, A. Chen, and T. Bu. Tracking quantiles of network data streams with dynamic operations. In INFOCOM, 2010 Proceedings IEEE, pages 1--5. IEEE, 2010.
[3]
J. M. Chambers, D. A. James, D. Lambert, and S. V. Wiel. Monitoring networked applications with incremental quantile estimation. Statistical Science, pages 463--475, 2006.
[4]
F. Chen, D. Lambert, and J. C. Pinheiro. Incremental quantile estimation for massive tracking. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 516--522. ACM, 2000.
[5]
G. Cormode and S. Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1):58--75, 2005.
[6]
M. Greenwald and S. Khanna. Space-efficient online computation of quantile summaries. In ACM SIGMOD Record, volume 30, pages 58--66. ACM, 2001.
[7]
J. I. Munro and M. S. Paterson. Selection and sorting with limited storage. Theoretical computer science, 12(3):315--323, 1980.
[8]
M. F. Norman. Markov processes and learning models, volume 84. Academic Press New York, 1972.
[9]
L. Tierney. A space-efficient recursive procedure for estimating a quantile of an unknown distribution. SIAM Journal on Scientific and Statistical Computing, 4(4):706--711, 1983.
[10]
B. Weide. Space-efficient on-line selection algorithms. In Computer Science and Statistics: Proceedings of the Eleventh Annual Symposium on the Interface, pages 308--311, 1978.
[11]
A. Yazidi and H. Hammer. An incremental quantile estimator using the theory of stochastic learning. Unabridged journal version of this paper, 2015. To be submitted for publication.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
RACS '15: Proceedings of the 2015 Conference on research in adaptive and convergent systems
October 2015
540 pages
ISBN:9781450337380
DOI:10.1145/2811411
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 October 2015

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. data stream
  2. multiplicative update
  3. quantiles estimation
  4. time varying distributions

Qualifiers

  • Research-article

Conference

RACS '15
Sponsor:

Acceptance Rates

RACS '15 Paper Acceptance Rate 75 of 309 submissions, 24%;
Overall Acceptance Rate 393 of 1,581 submissions, 25%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)0
Reflects downloads up to 05 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media