research-article

Dynamic Thermal Management of 3D Memory through Rotating Low Power States and Partial Channel Closure

Published: 09 November 2023

Abstract

Modern high-performance, high-bandwidth three-dimensional (3D) memories heat up frequently. Prior work turns off hot channels and migrates their data to the background DDR memory, incurring significant performance and energy overheads. We propose three Dynamic Thermal Management (DTM) approaches for 3D memories that reduce these overheads. The first approach, Rotating-channel Low-power-state-based DTM (RL-DTM), minimizes energy overheads by avoiding data migration: it places 3D memory channels into low power states instead of turning them off. Since data accesses are disallowed while a channel is in a low power state, RL-DTM balances the time each channel spends in that state. The second approach, Masked rotating-channel Low-power-state-based DTM (ML-DTM), is a fine-grained policy that minimizes the energy-delay product (EDP) and improves on the performance of RL-DTM by considering the channel access rate. The third approach, Partial channel closure and ML-DTM, minimizes the performance overheads of existing channel-level turn-off-based policies by closing a channel only partially and integrating ML-DTM, which reduces the number of channels turned off. We evaluate the proposed DTM policies using various mixes of SPEC benchmarks and multi-threaded workloads and observe that they significantly improve performance, energy, and EDP over state-of-the-art approaches for different 3D memory architectures.
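
The rotation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`rotate_low_power`, `low_power_epochs`), the fixed epoch granularity, the parameter `k` (channels in a low power state per epoch), and the `masked` set standing in for channels excluded because of a high access rate are all assumptions made for illustration. The sketch shows only how a round-robin rotation spreads low-power time evenly over the eligible channels.

```python
from collections import Counter
from typing import List, Optional, Set


def rotate_low_power(num_channels: int, num_epochs: int, k: int,
                     masked: Optional[Set[int]] = None) -> List[List[int]]:
    """Return, for each epoch, the channels placed in a low power state.

    Eligible channels are visited round-robin, k per epoch, so that
    low-power time is spread evenly. Channels in `masked` (hypothetically,
    those with a high access rate) are never selected.
    """
    masked = masked or set()
    eligible = [c for c in range(num_channels) if c not in masked]
    if k > len(eligible):
        raise ValueError("k exceeds the number of eligible channels")
    if k == 0:
        return [[] for _ in range(num_epochs)]

    schedule = []
    start = 0
    for _ in range(num_epochs):
        # Pick the next k eligible channels, wrapping around the list.
        chosen = [eligible[(start + i) % len(eligible)] for i in range(k)]
        schedule.append(chosen)
        start = (start + k) % len(eligible)
    return schedule


def low_power_epochs(schedule: List[List[int]]) -> Counter:
    """Count how many epochs each channel spent in a low power state."""
    return Counter(c for epoch in schedule for c in epoch)


if __name__ == "__main__":
    plain = rotate_low_power(num_channels=4, num_epochs=4, k=2)
    print(plain, dict(low_power_epochs(plain)))
    with_mask = rotate_low_power(num_channels=4, num_epochs=3, k=1, masked={0})
    print(with_mask, dict(low_power_epochs(with_mask)))
```

In this sketch, a masked channel simply stays active in every epoch. A real policy would presumably choose `k` and the mask from measured temperatures and access rates, which the abstract does not specify.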


Published In

ACM Transactions on Embedded Computing Systems  Volume 22, Issue 6
November 2023
428 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3632298
Editor: Tulika Mitra

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 November 2023
Online AM: 05 October 2023
Accepted: 27 August 2023
Revised: 27 July 2023
Received: 31 March 2023
Published in TECS Volume 22, Issue 6

Author Tags

  1. 3D memory
  2. Dynamic Thermal Management
  3. partial channel closure

Qualifiers

  • Research-article
