DOI: 10.1145/3651890.3672235
research-article

PPT: A Pragmatic Transport for Datacenters

Published: 04 August 2024

Abstract

This paper introduces PPT, a pragmatic transport that achieves performance comparable to proactive transports while retaining the good deployability of reactive transports. Our key idea is to run a low-priority control loop that uses the bandwidth left available by the reactive transports. The main challenge is to send just enough packets to improve performance without harming the primary control loop. We combine two unconventional techniques, intermittent loop initialization and exponential window decrease, which let us dynamically identify and fill the spare bandwidth. We further complement PPT's design with a buffer-aware flow scheduling scheme that optimizes the average flow completion time (FCT) of small flows without prior knowledge of flow sizes. We have implemented a PPT prototype in the Linux kernel with ~400 lines of code and demonstrate that, compared to Homa, it delivers up to 46.3% lower overall average FCT and even 25%/55.5% lower average/tail FCT for small flows in a Memcached workload.
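As a rough illustration of why an exponential window decrease keeps a low-priority loop from sending too much, the sketch below shrinks a window by a constant factor every round trip until it reaches zero. It is not the PPT algorithm: the abstract does not say how the initial window is chosen or what decrease factor is used, so `initial_window`, `factor`, and the function name are hypothetical.

```python
def low_priority_windows(initial_window: int, rounds: int, factor: float = 0.5) -> list[int]:
    """Return the low-priority window (in packets) used in each RTT round.

    Assumptions (not taken from the paper): the loop starts from a given
    initial window and multiplies it by `factor` once per round trip.
    """
    if initial_window < 0:
        raise ValueError("initial_window must be non-negative")
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be strictly between 0 and 1")

    windows = []
    window = initial_window
    for _ in range(rounds):
        if window < 1:
            break  # nothing left to send; the loop goes idle
        windows.append(window)
        window = int(window * factor)  # exponential (multiplicative) decrease
    return windows


if __name__ == "__main__":
    schedule = low_priority_windows(initial_window=16, rounds=10)
    print(schedule, sum(schedule))
```

Because the windows form a geometric series, the total number of extra packets sent in one run is bounded by `initial_window / (1 - factor)`, whatever the number of rounds. With a factor of 0.5 that bound is twice the initial window.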

      Published In

      ACM SIGCOMM '24: Proceedings of the ACM SIGCOMM 2024 Conference
      August 2024
      1033 pages
      ISBN:9798400706141
      DOI:10.1145/3651890

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. data center networks
      2. pragmatic transport
      3. dual-loop rate control
      4. flow scheduling

      Qualifiers

      • Research-article

      Funding Sources

      • NSFC

      Conference

      ACM SIGCOMM '24: ACM SIGCOMM 2024 Conference
      August 4 - 8, 2024
      Sydney, NSW, Australia

      Acceptance Rates

      Overall Acceptance Rate 462 of 3,389 submissions, 14%
