skip to main content
10.1145/1014052.1014059acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
Article

Recovering latent time-series from their observed sums: network tomography with particle filters.

Published: 22 August 2004 Publication History

Abstract

Hidden variables, evolving over time, appear in multiple settings, where it is valuable to recover them, typically from observed sums. Our driving application is 'network tomography', where we need to estimate the origin-destination (OD) traffic flows to determine, e.g., who is communicating with whom in a local area network. This information allows network engineers and managers to solve problems in design, routing, configuration debugging, monitoring and pricing. Unfortunately the direct measurement of the OD traffic is usually difficult, or even impossible; instead, we can easily measure the loads on every link, that is, sums of desirable OD flows.In this paper we propose i-FILTER, a method to solve this problem, which improves the state-of-the-art by (a) introducing explicit time dependence, and by (b) using realistic, non-Gaussian marginals in the statistical models for the traffic flows, as never attempted before. We give experiments on real data, where i-FILTER scales linearly with new observations and out-performs the best existing solutions, in a wide variety of settings. Specifically, on real network traffic measured at CMU, and at AT&T, i-FILTER reduced the estimation errors between 15% and 46% in all cases.

References

[1]
E. M. Airoldi. Advances in network tomography. Technical Report CMU-CALD-03-101, Carnegie Mellon University, 2003.
[2]
E. M. Airoldi and C. Faloutsos. Recovering Latent Time-Series from their Observed Sums. Technical Report CMU-CS-04-130, Carnegie Mellon University, 2004.
[3]
Y. Bejerano, Y. Breitbart, M. Garofalakis, and R. Rastogi. Physical topology discovery for large multi-subnet networks. In Proceedings of IEEE INFOCOM, pages 342--352, 2003.
[4]
Z. Bi, C. Faloutsos, and F. Korn. The DGX distribution for mining massive, skewed data. In Proceedings of ACM SIGKDD, pages 17--26, 2001.
[5]
J. Cao, D. Davis, S. Van Der Viel, and B. Yu. Time-varying network tomography: router link data. Journal of the American Statistical Association, 95:1063--75, 2000.
[6]
J. Cao, D. Davis, S. Van Der Viel, B. Yu, and Z. Zu. A scalable method for estimating network traffic matrices from link counts. Technical report, Bell Labs, 2001.
[7]
S. E. Fienberg. The geometry of an r x c contingency table. Annals of Mathematical Statistics, 39:1186--90, 1968.
[8]
S. E. Fienberg. The geometry of a two by two contingency table. Journal of the American Statistical Association, 65:694--701, 1970a.
[9]
W. R. Gilks and C. Berzuini. Following a moving target | monte carlo inference for dynamic bayesian models. Journal of the Royal Statistical Society, Series B, 63:127--146, 2001.
[10]
T. Higuchi. Self-organizing time series model. In A. Doucet, N. de Freitas, and N. Gordon, editors, Sequential Monte Carlo Methods in Practice. Springer-Verlag, 2001.
[11]
G. Liang and B. Yu. Pseudo-likelihood estimations in network tomography. In Proceedings of IEEE INFOCOM, 2003.
[12]
J. Liu. Monte Carlo Strategies in Scientific Computing. Springer-Verlag, 2001.
[13]
B. Mandelbrot. An informational theory of the statistical structure of language. In W. Jackson, editor, Communication Theory. Butterworths, 1953.
[14]
A. Medina, N. Taft, K. Salamatian, S. Bhattacharyya, and C. Diot. Traffic matrix estimation: existing techniques and new directions. In Proceedings of ACM SIGCOMM, pages 161--174, 2002.
[15]
I. Rish, M. Brodie, and S. Ma. Accuracy vs. efficiency trade-offs in probabilistic diagnosis. In Proceedings of AAAI, 2002.
[16]
I. Rish, M. Brodie, N. Odintsova, S. Ma, and G. Grabarnik. Real-time problem determination in distributed systems using active probing. To appear in NOMS, 2004.
[17]
C. and G. Casella. Monte Carlo Statistical Methods. Springer-Verlag, 1999.
[18]
C. Tebaldi and M. West. Bayesian inference on network traffic using link count data. Journal of the American Statistical Association, 93:557--576, 1998.
[19]
R. J. Vanderbei and J. Iannone. An em approach to od matrix estimation. Technical Report SOR 94-04, Princeton University, 1994.
[20]
Y. Vardi. Network tomography: estimating source-destination traffic intensities from link data. Journal of the American Statistical Association, 91:365--377, 1996.
[21]
S. Vaton and A. Gravey. Iterative bayesian estimation of network traffic matrices in the case of bursty flows. In Proceedings of the second ACM SIGCOMM Workshop on Internet measurment, pages 89--90, 2002.
[22]
W. Wei, B. Wang, D. Towsley. Continuous-time Hidden Markov Models for Network Performance Evaluation. Performance Evaluation, 49:129--146, 2002.
[23]
Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg. Fast accurate computation of large-scale IP traffic matrices from link loads. In Proceedings of SIGMETRICS, 2003.
[24]
Y. Zhang, M. Roughan, C. Lund, and D. Donoho. An information-theoretic approach to traffic matrix estimation. In Proceedings of SIGCOMM, 2003.
[25]
G. K. Zipf. Selected Studies of the Principle of Relative Frequency in Language. Harvard University Press, 1932.

Cited By

View all

Index Terms

  1. Recovering latent time-series from their observed sums: network tomography with particle filters.

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2004
    874 pages
    ISBN:1581138881
    DOI:10.1145/1014052
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 August 2004

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. MCMC
    2. empirical bayes
    3. informative priors
    4. link loads
    5. origin-destination traffic flows
    6. particle filter
    7. self-organizing Bayesian dynamical system

    Qualifiers

    • Article

    Conference

    KDD04

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Upcoming Conference

    KDD '25

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)2
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 09 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media