Hardware-Software Co-design to Mitigate DRAM Refresh Overheads: A Case for Refresh-Aware Process Scheduling

Authors: Jagadish B. Kotra, Narges Shahidi, Zeshan A. Chishti, Mahmut T. Kandemir

ACM SIGPLAN Notices, Volume 52, Issue 4

Pages 723 - 736

https://doi.org/10.1145/3093336.3037724

Published: 04 April 2017

Abstract

DRAM cells need periodic refresh to maintain data integrity. With high-capacity DRAMs, refresh poses a significant performance bottleneck because the number of rows refreshed by each refresh command (and hence the refresh cycle time, tRFC) increases. Modern DRAMs perform refresh at the rank level, while LPDDRs used in mobile environments support refresh at the per-bank level. Rank-level refresh degrades performance significantly, since none of the banks in a rank can serve on-demand requests while the rank is being refreshed. Per-bank refresh alleviates some of this bottleneck, as the other banks in the rank remain available for on-demand requests. Typical DRAM retention time is on the order of several milliseconds, viz., 64 ms for environments operating at temperatures below 85 °C and 32 ms for environments operating above 85 °C.
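To make the cost of refresh concrete, the following minimal sketch estimates how often a refresh command must be issued and what fraction of time the refreshed unit is unavailable. Only the 64 ms and 32 ms retention windows come from the abstract; the number of refresh commands per window (8192) and the tRFC value (350 ns) are assumed, illustrative parameters, and the function names are hypothetical.

```python
def refresh_interval_ns(retention_ms: float, refresh_commands: int) -> float:
    """Average gap between refresh commands (often called tREFI), in ns."""
    return retention_ms * 1_000_000 / refresh_commands


def refresh_busy_fraction(trfc_ns: float, trefi_ns: float) -> float:
    """Fraction of time the refreshed unit (rank or bank) cannot serve requests."""
    return trfc_ns / trefi_ns


# Assumed, illustrative parameters (not taken from the paper).
REFRESH_COMMANDS = 8192  # refresh commands issued per retention window
TRFC_NS = 350.0          # hypothetical refresh cycle time

if __name__ == "__main__":
    for retention_ms in (64, 32):  # the two retention windows named above
        trefi = refresh_interval_ns(retention_ms, REFRESH_COMMANDS)
        busy = refresh_busy_fraction(TRFC_NS, trefi)
        print(f"retention={retention_ms} ms  tREFI={trefi:.2f} ns  busy={busy:.2%}")
```

Under these assumptions, halving the retention window halves the interval between refresh commands and therefore doubles the busy fraction, and a larger tRFC raises it proportionally.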

With systems moving towards increased consolidation (e.g., virtualized environments), DRAM refresh becomes a significant bottleneck as it reduces the overall DRAM bandwidth available per task. In this work, we propose a hardware-software co-design that mitigates DRAM refresh overheads by exposing the hardware address mapping and the DRAM refresh schedule to the operating system. We propose a novel DRAM refresh-aware process scheduling algorithm in the OS that schedules applications on cores such that none of an application's on-demand requests are stalled by refreshes. Extensive evaluation of the proposed co-design on multi-programmed SPEC CPU2006 workloads shows significant performance improvement over previously proposed hardware-only approaches.
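The idea of refresh-aware scheduling can be illustrated with a toy model. This is a sketch under stated assumptions, not the algorithm from the paper: it assumes per-bank refresh that visits banks round-robin, one bank per scheduling quantum, and it assumes each task's data is confined to a known set of banks. The names `bank_under_refresh`, `pick_task`, and `simulate` are hypothetical.

```python
from typing import Dict, List, Optional, Set


def bank_under_refresh(quantum: int, num_banks: int) -> int:
    """Assumed refresh schedule: bank (quantum mod num_banks) is refreshed."""
    return quantum % num_banks


def pick_task(
    quantum: int,
    num_banks: int,
    task_banks: Dict[str, Set[int]],
    last_run: Dict[str, int],
) -> Optional[str]:
    """Pick the least recently run task whose banks avoid the refreshed bank."""
    busy = bank_under_refresh(quantum, num_banks)
    candidates = [t for t, banks in task_banks.items() if busy not in banks]
    if not candidates:
        return None  # every task touches the bank being refreshed
    # Tasks that have never run sort first (-1); ties are broken by name.
    return min(candidates, key=lambda t: (last_run.get(t, -1), t))


def simulate(
    num_quanta: int, num_banks: int, task_banks: Dict[str, Set[int]]
) -> List[Optional[str]]:
    """Run the toy scheduler for one core and return the chosen task per quantum."""
    last_run: Dict[str, int] = {}
    schedule: List[Optional[str]] = []
    for q in range(num_quanta):
        task = pick_task(q, num_banks, task_banks, last_run)
        if task is not None:
            last_run[task] = q
        schedule.append(task)
    return schedule


if __name__ == "__main__":
    # Hypothetical placement: task A lives in banks 0-1, task B in banks 2-3.
    print(simulate(4, 4, {"A": {0, 1}, "B": {2, 3}}))
```

The sketch relies on the operating system knowing both which banks a task's pages map to and which bank is refreshed when, which is the kind of information the co-design exposes to the OS.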

Published In

ACM SIGPLAN Notices Volume 52, Issue 4

ASPLOS '17

April 2017

811 pages

ISSN: 0362-1340

EISSN: 1558-1160

DOI: 10.1145/3093336

Editor: Matthew Fluet

ASPLOS '17: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems
April 2017
856 pages
ISBN: 9781450344654
DOI: 10.1145/3037697
General Chairs: Yunji Chen (Institute of Computing Technology, CAS, China), Olivier Temam (Google, USA)
Program Chair: John Carter (IBM, USA)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States
