DOI: 10.1145/3466752.3480098

NMAP: Power Management Based on Network Packet Processing Mode Transition for Latency-Critical Workloads

Published: 17 October 2021

Abstract

Processor power management based on Dynamic Voltage and Frequency Scaling (DVFS) plays a crucial role in improving data-center energy efficiency. However, we observe that current power management policies in Linux (i.e., governors) often considerably increase both the tail response time of latency-critical applications, violating a given Service Level Objective (SLO), and their energy consumption. Furthermore, previously proposed SLO-aware power management policies oversimplify network request processing and ignore the fact that network requests arrive at the application layer in bursts. Considering the complex interplay between the OS and network devices, we propose a power management framework that exploits network packet processing mode transitions in the OS to react quickly to the processing demands of received network requests. The framework tracks transitions between polling and interrupt modes in the network software stack to detect excessive packet processing on the cores, and it reacts immediately to load changes by updating the voltage and frequency (V/F) states. Our experimental results show that the framework does not violate the SLO and reduces energy consumption by up to 35.7% and 14.8% compared to Linux governors and state-of-the-art SLO-aware power management techniques, respectively.
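
The idea of driving V/F decisions from polling/interrupt mode transitions can be illustrated with a toy sketch. This is not the authors' implementation: the class name, the event hooks, the packet threshold, the frequency table, and the step-down rule are all hypothetical choices made for illustration only. The sketch assumes a per-core policy that is notified when packet processing switches from interrupt mode to polling mode, counts the packets handled while polling, and raises the frequency as soon as that count suggests excessive packet processing.

```python
from dataclasses import dataclass

# Hypothetical frequency table (GHz); real tables are processor-specific.
FREQS_GHZ = (1.2, 1.6, 2.0, 2.4)


@dataclass
class ModeTransitionGovernor:
    """Toy per-core policy driven by packet-processing mode transitions."""
    poll_budget: int = 64      # assumed threshold: packets per polling episode
    level: int = 0             # index into FREQS_GHZ; start at the lowest state
    polling: bool = False      # True while the core is in polling mode
    polled_packets: int = 0    # packets handled in the current polling episode

    def on_interrupt(self) -> None:
        """Interrupt -> polling transition: a new polling episode begins."""
        self.polling = True
        self.polled_packets = 0

    def on_poll(self, packets: int) -> None:
        """Account for one polling pass; boost once the budget is exceeded."""
        if not self.polling:
            return
        self.polled_packets += packets
        if self.polled_packets > self.poll_budget:
            self.level = len(FREQS_GHZ) - 1  # jump straight to the highest state

    def on_poll_complete(self) -> None:
        """Polling -> interrupt transition: the receive queue has drained."""
        if self.polling and self.polled_packets <= self.poll_budget:
            self.level = max(0, self.level - 1)  # light episode: step down once
        self.polling = False

    @property
    def freq_ghz(self) -> float:
        return FREQS_GHZ[self.level]


if __name__ == "__main__":
    gov = ModeTransitionGovernor()
    gov.on_interrupt()   # a burst arrives and the core starts polling
    gov.on_poll(100)     # more packets than the assumed budget
    print(gov.freq_ghz)
```

The design choice shown here is that the decision is event-driven rather than based on periodically sampled utilization, which is the property the abstract emphasizes; how the real framework sets thresholds or lowers V/F states is not specified in this text.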

Published In

MICRO '21: MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture
October 2021
1322 pages
ISBN: 9781450385572
DOI: 10.1145/3466752
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Data-center server
  2. Dynamic voltage and frequency scaling
  3. Power management
  4. Tail latency

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MICRO '21

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 504
  • Downloads (last 6 weeks): 67

Reflects downloads up to 17 Jan 2025
