Characterizing per-application network traffic using entropy

Published: 10 May 2013

Abstract

The Internet has been evolving into a more heterogeneous internetwork with diverse new applications imposing more stringent bandwidth and QoS requirements. Already new applications such as YouTube, Hulu, and Netflix are consuming a large fraction of the total bandwidth. We argue that, in order to engineer future internets such that they can adequately cater to their increasingly diverse and complex set of applications while using resources efficiently, it is critical to be able to characterize the load that emerging and future applications place on the underlying network. In this article, we investigate entropy as a metric for characterizing per-flow network traffic complexity. While previous work has analyzed aggregated network traffic, we focus on studying isolated traffic flows. Per-application flow characterization caters to the need of network control functions such as traffic scheduling and admission control at the edges of the network. Such control functions necessitate differentiating network traffic on a per-application basis. The “entropy fingerprints” that we get from our entropy estimator summarize many characteristics of each application's network traffic. Not only can we compare applications on the basis of peak entropy, but we can also categorize them based on a number of other properties of the fingerprints.

References

[1] Apple. 2010a. iChat in OS X Leopard. http://www.apple.com/asia/macosx/leopard/features/ichat.html.
[2] Apple. 2010b. iChat Wikipedia entry. http://en.wikipedia.org/wiki/Ichat.
[3] Basharin, G. 1959. On a statistical estimate for the entropy of a sequence of independent random variables. Theory Probab. Appl. 4, 333.
[4] Beran, J., Sherman, R., Taqqu, M., and Willinger, W. 1995. Long-range dependence in variable-bit-rate video traffic. IEEE Trans. Comm. 43, 234, 1566--1579.
[5] Berkeley, L. 2001. National laboratory network research. tcpdump: The protocol packet capture and dumper program. http://www.tcpdump.org. In The Protocol Packet Capture and Dumper Program, 2003. 164.
[6] Bonfiglio, D., Mellia, M., Meo, M., and Rossi, D. 2009. Detailed analysis of Skype traffic. IEEE Trans. Multimed. 11, 1, 117--127.
[7] Contributors. 2010. YouTube Wikipedia entry. http://en.wikipedia.org/w/index.php?title=Youtube&oldid=380031496.
[8] Cover, T. M. and Thomas, J. A. 1991. Elements of Information Theory. Wiley-Interscience, New York.
[9] Crovella, M. E. and Bestavros, A. 1996. Self-similarity in World Wide Web traffic: Evidence and possible causes. IEEE/ACM Trans. Netw. 5, 835--846.
[10] Feinstein, L., Schnackenberg, D., Balupari, R., and Kindred, D. 2003. Statistical approaches to DDoS attack detection and response. In Proceedings of the DARPA Information Survivability Conference and Exposition. 303--314.
[11] Gao, Y., Kontoyiannis, I., and Bienenstock, E. 2006. From the entropy to the statistical structure of spike trains. In Proceedings of the IEEE International Symposium on Information Theory. 645--649.
[12] Google. 2010. GoogleTalk developer info. http://code.google.com/apis/talk/open_communications.html.
[13] Hulu. Hulu media faq. http://www.hulu.com/about/media_faq.
[14] Hunt, N. 2008. Netflix encoding for streaming. http://blog.netflix.com/2008/11/encoding-for-streaming.html.
[15] Karagiannis, T., Faloutsos, M., and Molle, M. 2003. A user-friendly self-similarity analysis tool. SIGCOMM Comput. Comm. Rev. 33, 3, 81--93.
[16] Lakhina, A., Crovella, M., and Diot, C. 2005. Mining anomalies using traffic feature distributions. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM'05). ACM, New York, 217--228.
[17] Lall, A., Sekar, V., Ogihara, M., Xu, J., and Zhang, H. 2006. Data streaming algorithms for estimating entropy of network traffic. In Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS'06/Performance'06). ACM, New York, 145--156.
[18] Leland, W. E., Taqqu, M. S., Willinger, W., and Wilson, D. V. 1993. On the self-similar nature of Ethernet traffic. In Conference Proceedings on Communications Architectures, Protocols and Applications (SIGCOMM'93). ACM, New York, 183--193.
[19] Norris, R. 1998. Markov Chains (Cambridge Series in Statistics and Probabilistic Mathematics). Cambridge University Press.
[20] Park, K., Kim, G., and Crovella, M. 1996. On the relationship between file sizes, transport protocols, and self-similar network traffic. In Proceedings of the IEEE International Conference on Network Protocols. 171--180.
[21] Paxson, V. and Floyd, S. 1995. Wide area traffic: The failure of Poisson modeling. IEEE/ACM Trans. Netw. 3, 3, 226--244.
[22] Perényi, M. and Molnár, S. 2007. Enhanced Skype traffic identification. In Proceedings of the 2nd International Conference on Performance Evaluation Methodologies and Tools (ValueTools'07). ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), Brussels, Belgium, 1--9.
[23] Richman, J. S. and Moorman, J. R. 2000. Physiological time-series analysis using approximate entropy and sample entropy. Amer. J. Physiol. Heart Circ. Physiol. 278, 6, H2039--2049.
[24] Riihijarvi, J., Wellens, M., and Mahonen, P. 2009. Measuring complexity and predictability in networks with multiscale entropy analysis. In Proceedings of IEEE INFOCOM. 1107--1115.
[25] Roberts, L. 2009. A radical new router. IEEE Spectrum.
[26] Rossi, D., Valenti, S., Veglia, P., Bonfiglio, D., Mellia, M., and Meo, M. 2008. Pictures from the Skype. SIGMETRICS Perform. Eval. Rev. 36, 2, 83--86.
[27] Sandvine Incorporated. 2011. Global internet phenomena report. http://www.sandvine.com/news/global_broadband_trends.asp.
[28] Sang, A. and Li, S. 2000. A predictability analysis of network traffic. In Proceedings of the 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'00). 342--351.
[29] Shannon, C. E. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379--423.
[30] Vu, V., Yu, B., and Kass, R. 2007. Coverage-adjusted entropy estimation. Stat. Med. 26, 21, 4039--4060.
[31] Wagner, A. and Plattner, B. 2005. Entropy based worm and anomaly detection in fast IP networks. In Proceedings of the 14th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprise (WETICE'05). IEEE Computer Society, Los Alamitos, CA, 172--177.
[32] Walsworth, C., Aben, E., Claffy, K., and Andersen, D. 2009. The CAIDA anonymized 2009 Internet traces - (jan 15). http://www.caida.org/data/passive/passive_2009_dataset.xml.
[33] Willems, F., Shtarkov, Y., and Tjalkens, T. 1995. The context-tree weighting method: Basic properties. IEEE Trans. Info. Theory 41, 3, 653--664.
[34] Willinger, W., Taqqu, M. S., Sherman, R., and Wilson, D. V. 1997. Self-similarity through high-variability: Statistical analysis of Ethernet LAN traffic at the source level. IEEE/ACM Trans. Netw. 5, 1, 71--86.
[35] Xu, K., Zhang, Z.-L., and Bhattacharya, S. 2005. Profiling internet backbone traffic: Behavior models and applications. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM'05). ACM, New York, 169--180.

    Reviews

    Amos O Olagunju

    Several emerging algorithms for data mining, modeling, and simulation support the visualization of business data and information. Business applications with intensive computational needs must use bandwidth efficiently to operate effectively over the Internet. How should these increasingly multifaceted applications be characterized for the effective engineering of future networks and the Internet? The authors of this paper examine entropy as a measure of the complexity of per-application network traffic flows. They use real-time applications (Skype, iChat, and Google Talk) and streaming media (Hulu, Netflix, and ABC's webcam stream) to investigate self-similarity and develop an entropy metric for characterizing flows in network traffic. The experimental network traces, captured with tcpdump, cover buffered, bursty, bandwidth-dependent, and codec-dependent applications that use the transmission control protocol (TCP) or user datagram protocol (UDP) to transport data. The cumulative distributions of the packet inter-arrival times of real-time and media streaming flows are displayed to illustrate patterns in the traffic. Flows for Skype voice over Internet protocol (VoIP) and video conferencing show distinct patterns of audio and video traffic. The patterns of iChat audio and video flows were similar, but the Skype audio flow pattern was less complicated than the iChat audio flow pattern. The flow pattern of Google Talk audio was less complicated than that of Skype audio, and the traffic flow patterns of Google Talk audio and video were different. There were no discernible differences among the traffic flow patterns of Hulu, ABC, and Netflix. The average flow rate of traffic is used to estimate self-similarity for forecasting network traffic flows. There was only weak evidence to support the self-similarity of video and audio flows in iChat and Skype traffic.
Consequently, the authors developed a multiscale plug-in estimator of packet timing entropy, to capture the predictable finite memory of arriving packets in time intervals and compute the packet size sequences. They then use a flow trace of packet arrival timestamps to validate the accuracy of the entropy estimator. The entropy estimator is shown to be reliable in predicting the arrival of packets in time intervals for real-time and streaming media audio and video traffic flows. Unlike well-known predictors that assume distribution models to forecast traffic flows [1], the entropy estimator uses a table of bit patterns with associated probabilities in its prediction, without assuming any model. The entropy estimator generates entropic peaks for comparing and classifying video and audio applications. The authors provide valuable insights on the use of entropy estimator fingerprints for network intrusion detection, admission control of application flows, and strategic traffic scheduling based on the available bandwidth. Although the study only looked at the effects of packet timing on traffic flows and not packet size, all current and future network engineers should find this incredible paper interesting.
Online Computing Reviews Service
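
The review's description of a multiscale plug-in estimator built on a table of bit patterns and their probabilities can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`binarize_arrivals`, `plugin_entropy_rate`, `entropy_fingerprint`), the rule that a time bin is 1 when at least one packet arrives in it, and the use of overlapping fixed-length words are all assumptions made for illustration.

```python
import math
from collections import Counter


def binarize_arrivals(timestamps, bin_width):
    """Map arrival timestamps to a 0/1 sequence (assumed rule: 1 if any packet fell in the bin)."""
    if not timestamps:
        return []
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) / bin_width) + 1
    bits = [0] * n_bins
    for t in timestamps:
        bits[int((t - start) / bin_width)] = 1
    return bits


def plugin_entropy_rate(bits, word_len):
    """Plug-in entropy estimate in bits per symbol.

    Builds a table of observed bit patterns (overlapping words of length
    word_len), uses their relative frequencies as probabilities, and
    divides the resulting Shannon entropy by the word length.
    """
    n_words = len(bits) - word_len + 1
    if n_words <= 0:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(tuple(bits[i:i + word_len]) for i in range(n_words))
    h = -sum((c / n_words) * math.log2(c / n_words) for c in counts.values())
    return h / word_len


def entropy_fingerprint(timestamps, bin_widths, word_len=8):
    """Hypothetical 'fingerprint': the entropy estimate evaluated at several time scales."""
    return {
        w: plugin_entropy_rate(binarize_arrivals(timestamps, w), word_len)
        for w in bin_widths
    }


if __name__ == "__main__":
    # Synthetic, strictly periodic arrivals (one packet every 2 time units).
    arrivals = list(range(0, 200, 2))
    for width, h in entropy_fingerprint(arrivals, [1, 2, 4], word_len=4).items():
        print(f"bin width {width}: {h:.3f} bits/symbol")
```

In this sketch a perfectly periodic flow yields a low estimate at every scale, while irregular arrivals would push the estimate up at the scales where the irregularity is visible. The plug-in estimate is known to be biased downward for short traces, so any comparison between applications would need traces long enough for the chosen word length.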

    Published In

    ACM Transactions on Modeling and Computer Simulation  Volume 23, Issue 2
    May 2013
    92 pages
    ISSN:1049-3301
    EISSN:1558-1195
    DOI:10.1145/2457459
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 10 May 2013
    Accepted: 01 October 2012
    Revised: 01 October 2011
    Received: 01 June 2011
    Published in TOMACS Volume 23, Issue 2

    Author Tags

    1. Entropy estimator
    2. data analysis
    3. self-similarity
    4. statistics
    5. traffic complexity

    Qualifiers

    • Research-article
    • Research
    • Refereed
