DOI: 10.1145/2018436.2018465

Towards predictable datacenter networks

Published: 15 August 2011

Abstract

The shared nature of the network in today's multi-tenant datacenters implies that network performance for tenants can vary significantly. This applies to both production datacenters and cloud environments. Network performance variability hurts application performance, which makes tenant costs unpredictable and causes provider revenue loss. Motivated by these factors, this paper makes the case for extending the tenant-provider interface to explicitly account for the network. We argue this can be achieved by providing tenants with a virtual network connecting their compute instances. To this effect, the key contribution of this paper is the design of virtual network abstractions that capture the trade-off between the performance guarantees offered to tenants, their costs, and the provider revenue.
To illustrate the feasibility of virtual networks, we develop Oktopus, a system that implements the proposed abstractions. Using realistic, large-scale simulations and an Oktopus deployment on a 25-node two-tier testbed, we demonstrate that the use of virtual networks yields significantly better and more predictable tenant performance. Further, using a simple pricing model, we find that our abstractions can reduce tenant costs by up to 74% while maintaining provider revenue neutrality.

Supplementary Material

MP4 File (sigcomm_8_1.mp4)

    Published In

    SIGCOMM '11: Proceedings of the ACM SIGCOMM 2011 conference
    August 2011
    502 pages
    ISBN:9781450307970
    DOI:10.1145/2018436
    ACM SIGCOMM Computer Communication Review, Volume 41, Issue 4
    SIGCOMM '11
    August 2011
    480 pages
    ISSN:0146-4833
    DOI:10.1145/2043164
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. allocation
    2. bandwidth
    3. datacenter
    4. virtual network

    Qualifiers

    • Research-article

    Conference

    SIGCOMM '11: ACM SIGCOMM 2011 Conference
    August 15 - 19, 2011
    Toronto, Ontario, Canada

    Acceptance Rates

    SIGCOMM '11 Paper Acceptance Rate 32 of 223 submissions, 14%;
    Overall Acceptance Rate 462 of 3,389 submissions, 14%
