Software-Defined Far Memory in Warehouse-Scale Computers

Authors:

Andres Lagar-Cavilla,

Suleiman Souhlal,

Radoslaw Burny,

Ashwin Chaugule,

Kamil Adam Yurtsever,

Parthasarathy Ranganathan

ASPLOS '19: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems

Pages 317 - 330

https://doi.org/10.1145/3297858.3304053

Published: 04 April 2019

Abstract

Increasing memory demand and a slowdown in technology scaling pose important challenges to the total cost of ownership (TCO) of warehouse-scale computers (WSCs). One promising idea to reduce the memory TCO is to add a cheaper, but slower, "far memory" tier and use it to store infrequently accessed (or cold) data. However, introducing a far memory tier brings new challenges around dynamically responding to workload diversity and churn, minimizing stranding of capacity, and addressing brownfield (legacy) deployments. We present a novel software-defined approach to far memory that proactively compresses cold memory pages to effectively create a far memory tier in software. Our end-to-end system design encompasses new methods to define performance service-level objectives (SLOs), a mechanism to identify cold memory pages while meeting the SLO, and our implementation in the OS kernel and node agent. Additionally, we design learning-based autotuning to periodically adapt our design to fleet-wide changes without a human in the loop. Our system has been successfully deployed across Google's WSC since 2016, serving thousands of production services. Our software-defined far memory is significantly cheaper (67% or higher memory cost reduction) at relatively good access speeds (6 µs) and allows us to store a significant fraction of infrequently accessed data (on average, 20%), translating to significant TCO savings at warehouse scale.
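
The core idea in the abstract, treating pages that have not been touched for some time as cold and compressing them in place of a hardware far-memory tier, can be illustrated with a small user-space sketch. This is not the authors' implementation, which the abstract places in the OS kernel and node agent. The `Page` type, the `split_cold` and `compress_cold` helpers, the fixed age threshold, and the use of `zlib` are all assumptions made purely for illustration.

```python
import zlib
from dataclasses import dataclass

PAGE_SIZE = 4096  # bytes; an assumed page size for this sketch


@dataclass
class Page:
    data: bytes           # page contents
    last_access_s: float  # time of the most recent access, in seconds


def split_cold(pages, now_s, cold_age_threshold_s):
    """Partition pages into (hot, cold) by time since last access.

    A page counts as cold when it has not been accessed for at least
    cold_age_threshold_s seconds (a hypothetical, fixed threshold).
    """
    hot, cold = [], []
    for page in pages:
        age_s = now_s - page.last_access_s
        (cold if age_s >= cold_age_threshold_s else hot).append(page)
    return hot, cold


def compress_cold(cold_pages):
    """Compress each cold page; return (compressed blobs, bytes saved)."""
    blobs = [zlib.compress(page.data) for page in cold_pages]
    original = sum(len(page.data) for page in cold_pages)
    compressed = sum(len(blob) for blob in blobs)
    return blobs, original - compressed


if __name__ == "__main__":
    pages = [
        Page(bytes(PAGE_SIZE), last_access_s=0.0),        # idle for 120 s
        Page(b"\x01" * PAGE_SIZE, last_access_s=100.0),   # idle for 20 s
    ]
    hot, cold = split_cold(pages, now_s=120.0, cold_age_threshold_s=60.0)
    blobs, saved = compress_cold(cold)
    # Accessing a compressed page later means paying a decompression cost.
    assert zlib.decompress(blobs[0]) == cold[0].data
    print(f"hot={len(hot)} cold={len(cold)} bytes_saved={saved}")
```

In this toy model the threshold is the only tuning knob: lowering it moves more pages into the compressed tier but makes it more likely that a compressed page is needed again soon, which is the tension the abstract's SLO and autotuning components are described as managing.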

Index Terms

Software-Defined Far Memory in Warehouse-Scale Computers
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management

Published In

ASPLOS '19: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems

April 2019

1126 pages

ISBN:9781450362405

DOI:10.1145/3297858

General Chairs: Iris Bahar (Brown University), Maurice Herlihy (Brown University)

Program Chairs: Emmett Witchel (University of Texas, Austin), Alvin Lebeck (Duke University)

Copyright © 2019 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.

Sponsors

In-Cooperation

SIGBED: ACM Special Interest Group on Embedded Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ASPLOS '19

Sponsor:

ASPLOS '19: Architectural Support for Programming Languages and Operating Systems

April 13 - 17, 2019

Providence, RI, USA

Acceptance Rates

ASPLOS '19 Paper Acceptance Rate 74 of 351 submissions, 21%;

Overall Acceptance Rate 535 of 2,713 submissions, 20%
