DOI:10.1145/3466752.3480089
research-article
Public Access

GreenDIMM: OS-assisted DRAM Power Management for DRAM with a Sub-array Granularity Power-Down State

Published: 17 October 2021

Abstract

The power and energy consumed by the DRAM that makes up the main memory of data-center servers have increased substantially as memory capacity and bandwidth have grown. In particular, background power already accounts for a large fraction of total DRAM power, and this fraction will continue to increase as DRAM technology scaling decelerates, because servers will need more DRAM modules, or more DRAM dies stacked in a package, to provide the necessary capacity. To reduce background power, DRAM power management can exploit the low average utilization of DRAM capacity in data-center servers (i.e., 40–60%). However, current DRAM power management supports low-power states only at rank granularity, which becomes ineffective with memory interleaving techniques devised to disperse memory requests across ranks. That is, with aggressive power management, ranks must frequently be woken up from low-power states, which can significantly degrade system performance; with conservative power management, ranks do not get a chance to enter low-power states.
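The effect of interleaving on rank-granularity power management can be illustrated with a minimal sketch. The address mapping below is hypothetical (a simple modulo scheme with an assumed channel count, rank count, and cache-line size); real memory controllers use vendor-specific mappings. It only shows that a small contiguous access can touch every rank, leaving none idle.

```python
# Hypothetical address mapping, for illustration only: real memory
# controllers use vendor-specific (often hashed) mappings.
CACHE_LINE = 64      # bytes per cache line (assumed)
NUM_CHANNELS = 2     # assumed number of channels
NUM_RANKS = 4        # assumed number of ranks per channel


def decode(addr):
    """Map a physical address to (channel, rank) under cache-line interleaving."""
    line = addr // CACHE_LINE
    channel = line % NUM_CHANNELS
    rank = (line // NUM_CHANNELS) % NUM_RANKS
    return channel, rank


def ranks_touched(start, size):
    """Return every (channel, rank) pair hit by a contiguous access.

    `start` is assumed to be cache-line aligned.
    """
    return {decode(a) for a in range(start, start + size, CACHE_LINE)}


if __name__ == "__main__":
    # One 4 KiB page is spread over all channels and ranks in this mapping.
    print(sorted(ranks_touched(0, 4096)))
```

Under these assumptions, even a single page keeps all ranks active, which is why a rank either cannot stay in a low-power state or must be woken up often.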
To tackle these limitations of current DRAM power management, we propose GreenDIMM, an OS-assisted DRAM power management scheme. First, GreenDIMM takes as its unit of DRAM power management a memory block in the physical address space that is mapped to a group of DRAM sub-arrays across every channel, rank, and bank. This facilitates fine-grained DRAM power management while keeping the benefit of memory interleaving techniques. Second, GreenDIMM exploits the memory on-/off-lining operations of the modern OS to dynamically remove memory blocks from, or add them to, the physical address space, depending on the utilization of memory capacity at run-time. Third, GreenDIMM implements a deep power-down state at sub-array granularity to reduce the background power of the off-lined memory blocks. Because the off-lined memory blocks are removed from the physical address space, their sub-arrays do not receive any memory requests and stay in the power-down state until the OS explicitly on-lines the memory blocks. Our evaluation with a commercial server running diverse workloads shows that GreenDIMM can reduce DRAM and system power by 36% and 20%, respectively, with ∼1% performance degradation.
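The utilization-driven on-/off-lining step can be sketched as follows. This is not the paper's policy: the `headroom` margin, the function names, and the decision rule are assumptions made for illustration. The sysfs path follows the Linux memory hotplug interface, where writing `online` or `offline` to a block's `state` file changes its status; whether GreenDIMM drives exactly this interface is also an assumption. The sketch only plans the action and never writes to sysfs.

```python
import math

# Linux memory hotplug sysfs root; each block appears as memory<N>/state.
SYSFS_MEMORY = "/sys/devices/system/memory"


def blocks_to_keep_online(used_bytes, block_bytes, total_blocks, headroom=0.1):
    """Blocks that must stay on-line to hold used memory plus a margin.

    `headroom` is a hypothetical tuning knob, not a value from the paper.
    """
    needed = math.ceil(used_bytes * (1.0 + headroom) / block_bytes)
    return max(1, min(total_blocks, needed))


def plan(used_bytes, block_bytes, total_blocks, online_now, headroom=0.1):
    """Return (action, count), where action is 'offline', 'online', or 'hold'."""
    target = blocks_to_keep_online(used_bytes, block_bytes, total_blocks, headroom)
    if target < online_now:
        return ("offline", online_now - target)
    if target > online_now:
        return ("online", target - online_now)
    return ("hold", 0)


def state_path(block_id):
    """sysfs file that accepts 'online' or 'offline' for one memory block."""
    return f"{SYSFS_MEMORY}/memory{block_id}/state"


if __name__ == "__main__":
    GIB, BLOCK = 1 << 30, 128 << 20   # assumed 128 MiB memory blocks
    # 64 GiB installed (512 blocks), 32 GiB in use, everything on-line.
    print(plan(32 * GIB, BLOCK, 512, online_now=512, headroom=0.0))
    print(state_path(7))
```

A real controller would also have to handle off-lining failures (for example, blocks holding unmovable pages) and avoid oscillating between on-lining and off-lining; those concerns are outside this sketch.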

Published In

MICRO '21: MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture
October 2021
1322 pages
ISBN:9781450385572
DOI:10.1145/3466752
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. DRAM power management
  2. memory off-lining

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MICRO '21
Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%
