
Accelerating On-Chip Training with Ferroelectric-Based Hybrid Precision Synapse

Published: 12 January 2022

Abstract

In this article, we propose a hardware accelerator design that uses a ferroelectric field-effect transistor (FeFET)-based hybrid precision synapse (HPS) for deep neural network (DNN) on-chip training. The drain-erase scheme for FeFET programming is incorporated into both the FeFET HPS design and the FeFET buffer design. With drain erase, high-density FeFET buffers can be integrated on-chip to store the intermediate input/output activations and gradients, which reduces energy-consuming off-chip DRAM accesses. Architectural evaluation shows that the energy efficiency is improved by 1.2× ∼ 2.1× compared to other HPS-based designs and by 3.9× ∼ 6.0× compared to emerging non-volatile memory baselines. The chip area is reduced by 19% ∼ 36% compared with designs that use an SRAM on-chip buffer, even though the FeFET buffer capacity is increased. In addition, using the drain-erase scheme for FeFET programming reduces the chip area by 11% ∼ 28.5% compared with designs that use the body-erase scheme.
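For readers unfamiliar with the hybrid precision synapse idea behind the abstract, the following minimal Python sketch illustrates the general concept of splitting each weight into a non-volatile part (modeled here as multi-level FeFET states) and a volatile part that absorbs the frequent small gradient updates, with a periodic carry into the non-volatile bits. The bit widths, data layout, and carry policy are assumptions made purely for illustration, not the circuit or programming scheme described in the article.

```python
# Illustrative sketch only (assumed parameters, not the authors' implementation).
# A hybrid precision synapse (HPS) keeps most-significant bits (MSBs) in a
# non-volatile device and least-significant bits (LSBs) in a volatile element,
# so the FeFET is written only during occasional carry/transfer operations.
import numpy as np

MSB_BITS = 5          # assumed number of bits mapped to FeFET conductance levels
LSB_BITS = 3          # assumed number of bits in the volatile (frequently updated) part
LSB_RANGE = 2 ** LSB_BITS

class HybridPrecisionSynapse:
    def __init__(self, shape):
        self.msb = np.zeros(shape, dtype=np.int32)   # infrequently written non-volatile levels
        self.lsb = np.zeros(shape, dtype=np.int32)   # frequently written volatile levels

    def accumulate(self, grad_lsb):
        """Apply a small integer gradient update to the volatile LSBs only."""
        self.lsb += grad_lsb

    def transfer(self):
        """Periodic carry: fold accumulated LSB overflow into the non-volatile MSBs."""
        carry = self.lsb // LSB_RANGE                    # fixed-point carry (floor division)
        self.msb = np.clip(self.msb + carry,
                           -(2 ** (MSB_BITS - 1)), 2 ** (MSB_BITS - 1) - 1)
        self.lsb -= carry * LSB_RANGE

    def weight(self):
        """Effective weight seen by the in-memory multiply-accumulate."""
        return self.msb * LSB_RANGE + self.lsb

# Many small updates hit only the volatile part; the non-volatile device is
# written once per transfer interval.
syn = HybridPrecisionSynapse((2, 2))
for _ in range(20):
    syn.accumulate(np.ones((2, 2), dtype=np.int32))
syn.transfer()
print(syn.msb, syn.lsb, syn.weight())
```

In this toy setting the volatile LSBs absorb twenty update pulses while the non-volatile MSBs receive a single carry write, which is the qualitative reason an HPS reduces write pressure on the FeFET during training.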



      Published In

ACM Journal on Emerging Technologies in Computing Systems, Volume 18, Issue 2
April 2022, 411 pages
ISSN: 1550-4832
EISSN: 1550-4840
DOI: 10.1145/3508462
Editor: Ramesh Karri

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 12 January 2022
      Accepted: 01 June 2021
      Revised: 01 March 2021
      Received: 01 July 2020
      Published in JETC Volume 18, Issue 2

      Author Tags

1. deep neural network
2. emerging non-volatile memory
3. ferroelectric field-effect transistor (FeFET)
4. DNN hardware acceleration
5. in-memory computing
6. on-chip training

      Qualifiers

      • Research-article
      • Refereed

      Funding Sources

      • ASCENT
• SRC/DARPA JUMP
      • SONY
