vSPACE: Supporting Parallel Network Packet Processing in Virtualized Environments through Dynamic Core Management

Authors: Gyeongseo Park, Yunhyeong Jeon, Daehoon Kim

PACT '24: Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques

Pages 14 - 25

https://doi.org/10.1145/3656019.3689610

Published: 13 October 2024

Abstract

Data centers face significant performance challenges with parallel processing for network I/O in virtualized environments, particularly for latency-critical (LC) workloads that must satisfy strict Service Level Objectives (SLOs). While previous studies have addressed performance challenges in network I/O virtualization, they overlook the impact of excessive parallelism on the performance of Virtual Machines (VMs). We observe that excessive parallelization of VMs and network I/O processing can lead to core oversubscription, resulting in significant resource contention, frequent preemptions, and task migrations. Based on these observations, we propose vSPACE, a dynamic core management scheme designed to efficiently support parallel network I/O processing in virtualized environments. To reduce scheduling contention, vSPACE creates distinct core allocation groups for VMs and for network I/O processing and assigns dedicated cores to each group. It then dynamically adjusts the number of allocated cores to enforce appropriate parallelism for VMs and network I/O processing as demand varies. vSPACE employs continuous monitoring and a heuristic algorithm to periodically determine an appropriate core allocation, addressing excessive contention and improving energy and resource efficiency. vSPACE operates in three modes: performance improvement, energy efficiency, and resource efficiency. Our evaluations demonstrate that vSPACE improves throughput by up to 4.2× compared to existing core allocation approaches and improves energy and resource efficiency by up to 16.5% and 30.5%, respectively.
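The abstract does not spell out the heuristic itself, so the following is only an illustrative sketch of the general idea of periodically resizing two dedicated core groups from monitored load. Everything in it is an assumption made for illustration: the `Allocation` type, the `rebalance` function, the use of average utilization as the monitored signal, and the `high` and `low` thresholds are hypothetical and are not taken from vSPACE.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    vm_cores: int  # cores dedicated to the VM group
    io_cores: int  # cores dedicated to the network I/O group


def rebalance(alloc, vm_util, io_util, total_cores, high=0.8, low=0.3):
    """One step of a hypothetical threshold heuristic (not the paper's algorithm).

    vm_util and io_util are the average utilizations (0.0 to 1.0) of the
    cores currently assigned to each group over the last monitoring interval.
    """
    vm, io = alloc.vm_cores, alloc.io_cores

    # Shrink a lightly loaded group, but never below one core.
    if vm_util < low and vm > 1:
        vm -= 1
    if io_util < low and io > 1:
        io -= 1

    # Grow a heavily loaded group by one core if an unallocated core exists.
    # The busier group is considered first.
    for util, group in sorted([(vm_util, "vm"), (io_util, "io")], reverse=True):
        if util <= high or vm + io >= total_cores:
            continue
        if group == "vm":
            vm += 1
        else:
            io += 1

    return Allocation(vm, io)


if __name__ == "__main__":
    # VM group is saturated, I/O group is mostly idle: one core moves over.
    print(rebalance(Allocation(4, 4), vm_util=0.9, io_util=0.2, total_cores=8))
    # prints "Allocation(vm_cores=5, io_cores=3)"
```

Because the two groups never share a core in this sketch, their sizes always sum to at most `total_cores`; cores left unallocated after a step stand in for the capacity that could be idled or returned to other tenants. A real controller would call such a step once per monitoring period.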


Index Terms

vSPACE: Supporting Parallel Network Packet Processing in Virtualized Environments through Dynamic Core Management
1. Networks
  1. Network algorithms
  2. Network services
    1. Cloud computing
    2. Programmable networks
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
      2. Software infrastructure
        1. Virtual machines

Index terms have been assigned to the content through auto-classification.

Information

Published In

PACT '24: Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques

October 2024

375 pages

ISBN: 9798400706318

DOI: 10.1145/3656019

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 October 2024

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Institute of Information & Communications Technology Planning & Evaluation
National Research Foundation of Korea

Conference

PACT '24

Sponsor:

SIGARCH

PACT '24: International Conference on Parallel Architectures and Compilation Techniques

October 14 - 16, 2024

Long Beach, CA, USA

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%

Bibliometrics & Citations

Article Metrics

Total Citations: 0
Total Downloads: 177
Downloads (Last 12 months): 177
Downloads (Last 6 weeks): 48

Reflects downloads up to 29 Jan 2025
