DOI: 10.1145/3195970.3196071
Research article

Long live TIME: improving lifetime for training-in-memory engines by structured gradient sparsification

Published: 24 June 2018

Abstract

Deeper and larger Neural Networks (NNs) have made breakthroughs in many fields, but conventional CMOS-based computing platforms struggle to deliver the required energy efficiency. RRAM-based systems provide a promising solution for building efficient Training-In-Memory Engines (TIME). However, the endurance of RRAM cells is limited, which is a severe issue because NN weights must be updated thousands to millions of times during training. Gradient sparsification can mitigate this problem by dropping most of the smaller gradients, but it introduces unacceptable computation cost. We propose an effective framework, SGS-ARS, combining a Structured Gradient Sparsification (SGS) scheme with an Aging-aware Row Swapping (ARS) scheme, to balance writes across whole RRAM crossbars and prolong the lifetime of TIME. Our experiments demonstrate a 356× lifetime extension when TIME is programmed to train ResNet-50 on the ImageNet dataset with our SGS-ARS framework.
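As a concrete illustration of the two schemes named in the abstract, the sketch below shows gradient sparsification at row granularity together with a simple aging-aware row remapping. This is a minimal sketch in NumPy: the row-level structure, the top_ratio parameter, and the write_counts bookkeeping are illustrative assumptions, since the abstract does not specify the paper's exact selection rule or swapping policy.

    import numpy as np

    def structured_gradient_sparsification(grad, top_ratio=0.01):
        """Zero out all but the top `top_ratio` fraction of gradient rows,
        ranked by L2 norm, so entire crossbar rows can skip the write.
        (Row granularity is an assumption for illustration.)"""
        num_rows = grad.shape[0]
        k = max(1, int(num_rows * top_ratio))
        row_norms = np.linalg.norm(grad, axis=1)
        keep = np.argsort(row_norms)[-k:]   # indices of the k largest rows
        sparse = np.zeros_like(grad)
        sparse[keep] = grad[keep]
        return sparse, keep

    def aging_aware_row_swap(write_counts, rows_to_write):
        """Map the logical rows about to be written onto the least-worn
        physical rows, balancing wear across the crossbar.
        (Per-row write counters are an assumed bookkeeping mechanism.)"""
        least_worn = np.argsort(write_counts)[:len(rows_to_write)]
        return dict(zip(rows_to_write, least_worn))

    # Usage: sparsify a weight gradient, then pick fresh physical rows for it.
    rng = np.random.default_rng(0)
    grad = rng.standard_normal((128, 128))
    write_counts = rng.integers(0, 1000, size=128)

    sparse_grad, rows = structured_gradient_sparsification(grad, top_ratio=0.05)
    mapping = aging_aware_row_swap(write_counts, rows)
    for logical, physical in mapping.items():
        write_counts[physical] += 1   # record the wear of this update

One plausible reason for row-granular structure, reflected in the sketch: a crossbar programs a whole row in one operation, so selecting and wear-leveling at row rather than cell granularity keeps both the write schedule and the bookkeeping small.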




Published In

DAC '18: Proceedings of the 55th Annual Design Automation Conference
June 2018
1089 pages
ISBN: 9781450357005
DOI: 10.1145/3195970


Publisher

Association for Computing Machinery

New York, NY, United States



Funding Sources

  • National Key R&D Program of China
  • Joint Fund of Equipment pre-Research and Ministry of Education
  • National Natural Science Foundation of China
  • Beijing National Research Center for Information Science and Technology

Conference

DAC '18: The 55th Annual Design Automation Conference 2018
June 24 - 29, 2018
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

