DOI: 10.1145/3656019.3689610
research-article
Open access

vSPACE: Supporting Parallel Network Packet Processing in Virtualized Environments through Dynamic Core Management

Published: 13 October 2024

Abstract

Data centers face significant performance challenges with parallel processing for network I/O in virtualized environments, particularly for latency-critical (LC) workloads that must satisfy strict Service Level Objectives (SLOs). While previous studies have addressed performance challenges in network I/O virtualization, they overlook the impact of excessive parallelism on the performance of Virtual Machines (VMs). We observe that excessive parallelization for VMs and network I/O processing can lead to core oversubscription, resulting in significant resource contention, frequent preemptions, and task migrations. Based on these observations, we propose vSPACE, a dynamic core management scheme designed to efficiently support parallel network I/O processing in virtualized environments. To reduce scheduling contention, vSPACE creates distinct core allocation groups for VMs and for network I/O processing and assigns dedicated cores to each group. It then dynamically adjusts the number of cores allocated to each group, enforcing appropriate parallelism for VMs and network I/O processing as demand varies. vSPACE employs continuous monitoring and a heuristic algorithm to periodically determine an appropriate core allocation, addressing excessive contention and improving energy and resource efficiency. vSPACE operates in three modes: performance improvement, energy efficiency, and resource efficiency. Our evaluations demonstrate that vSPACE improves throughput by up to 4.2× compared to existing core allocation approaches and improves energy and resource efficiency by up to 16.5% and 30.5%, respectively.
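
The periodic adjustment described in the abstract can be pictured with a small sketch. This is not the vSPACE algorithm, which the abstract does not specify. It is a minimal illustration that assumes two dedicated core groups (VM and network I/O), a per-group utilization sample each monitoring period, and a one-core-at-a-time grow/shrink rule. The thresholds `HIGH_UTIL` and `LOW_UTIL`, the `Allocation` type, and the `rebalance` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the abstract gives no concrete values.
HIGH_UTIL = 0.80
LOW_UTIL = 0.30


@dataclass
class Allocation:
    """Cores dedicated to each group; the two groups never share a core."""
    vm_cores: int
    io_cores: int
    total_cores: int

    @property
    def free_cores(self) -> int:
        return self.total_cores - self.vm_cores - self.io_cores


def _desired_change(cores: int, util: float) -> int:
    """Return +1 to grow, -1 to shrink, 0 to hold (each group keeps >= 1 core)."""
    if util > HIGH_UTIL:
        return 1
    if util < LOW_UTIL and cores > 1:
        return -1
    return 0


def rebalance(alloc: Allocation, vm_util: float, io_util: float) -> Allocation:
    """One monitoring period: release idle cores first, then grant free ones."""
    vm, io = alloc.vm_cores, alloc.io_cores
    d_vm = _desired_change(vm, vm_util)
    d_io = _desired_change(io, io_util)

    # Shrink before growing so a released core can be reassigned immediately.
    if d_vm < 0:
        vm -= 1
    if d_io < 0:
        io -= 1

    free = alloc.total_cores - vm - io
    if d_vm > 0 and free > 0:
        vm, free = vm + 1, free - 1
    if d_io > 0 and free > 0:
        io, free = io + 1, free - 1
    return Allocation(vm, io, alloc.total_cores)


if __name__ == "__main__":
    # Busy VM group, idle I/O group: one core moves from I/O to VM.
    print(rebalance(Allocation(4, 4, 8), vm_util=0.9, io_util=0.1))
    # Allocation(vm_cores=5, io_cores=3, total_cores=8)
```

In this sketch, cores left in the free pool stand in for the energy and resource savings the abstract mentions: an energy-oriented mode might idle them, while a resource-oriented mode might hand them to other tenants. How vSPACE actually distinguishes its three modes is not stated in the abstract.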


Published In

PACT '24: Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques
October 2024
375 pages
ISBN: 9798400706318
DOI: 10.1145/3656019
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

PACT '24

Acceptance Rates

Overall Acceptance Rate: 121 of 471 submissions, 26%
