DOI: 10.1145/1815961.1816012
research-article

Resistive computation: avoiding the power wall with low-leakage, STT-MRAM based computing

Published: 19 June 2010

Abstract

As CMOS scales beyond the 45nm technology node, leakage concerns are starting to limit microprocessor performance growth. To keep dynamic power constant across process generations, traditional MOSFET scaling theory prescribes reducing supply and threshold voltages in proportion to device dimensions, a practice that induces an exponential increase in subthreshold leakage. As a result, leakage power has become comparable to dynamic power in current-generation processes, and will soon exceed it in magnitude if voltages are scaled down any further. Beyond this inflection point, multicore processors will not be able to afford keeping more than a small fraction of all cores active at any given moment. Multicore scaling will soon hit a power wall.
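The exponential link between threshold voltage and subthreshold leakage can be made concrete with a short sketch. It relies on the standard textbook relation in which leakage current grows by one decade for every S millivolts of threshold-voltage reduction, where S is the subthreshold swing. The function name `leakage_ratio` and the default of 100 mV/decade are assumptions chosen for illustration; neither comes from the paper.

```python
def leakage_ratio(vth_drop_mv: float, swing_mv_per_decade: float = 100.0) -> float:
    """Factor by which subthreshold leakage grows when V_th drops by vth_drop_mv.

    Textbook model: I_sub is proportional to 10 ** (-V_th / S), with S the
    subthreshold swing in mV per decade. The default S = 100 mV/decade is an
    assumed, illustrative value, not a figure taken from the paper.
    """
    if swing_mv_per_decade <= 0:
        raise ValueError("subthreshold swing must be positive")
    return 10.0 ** (vth_drop_mv / swing_mv_per_decade)


if __name__ == "__main__":
    # Each additional 100 mV of threshold reduction multiplies leakage by 10
    # under the assumed swing.
    for drop in (0, 50, 100, 200):
        print(f"V_th lowered by {drop:3d} mV -> leakage x{leakage_ratio(drop):.1f}")
```

Under these assumptions, a linear reduction in threshold voltage produces a multiplicative increase in leakage, which is why proportional voltage scaling cannot continue indefinitely.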
This paper presents resistive computation, a new technique that aims to avoid the power wall by migrating most of the functionality of a modern microprocessor from CMOS to spin-torque transfer magnetoresistive RAM (STT-MRAM), a CMOS-compatible, leakage-resistant, non-volatile resistive memory technology. By implementing much of the on-chip storage and combinational logic using leakage-resistant, scalable RAM blocks and lookup tables, and by carefully re-architecting the pipeline, an STT-MRAM based implementation of an eight-core Sun Niagara-like CMT processor reduces chip-wide power dissipation by 1.7× and leakage power by 2.1× at the 32nm technology node, while maintaining 93% of the system throughput of a CMOS-based design.
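The abstract's headline numbers can be combined into a rough efficiency estimate. The sketch below derives throughput per watt and energy per unit of work from the 1.7× power reduction and the 93% relative throughput. These derived metrics are not reported in the abstract; they assume both figures describe the same workloads and the same 32nm comparison, and the function names are hypothetical.

```python
def throughput_per_watt_gain(power_reduction: float, relative_throughput: float) -> float:
    """Relative throughput per watt versus the CMOS baseline.

    power_reduction: baseline power divided by new power (1.7 in the abstract).
    relative_throughput: new throughput divided by baseline throughput (0.93).
    """
    if power_reduction <= 0 or relative_throughput <= 0:
        raise ValueError("both inputs must be positive")
    # (T_new / P_new) / (T_base / P_base) = (T_new / T_base) * (P_base / P_new)
    return relative_throughput * power_reduction


def relative_energy_per_unit_work(power_reduction: float, relative_throughput: float) -> float:
    """Energy per unit of work relative to the baseline (lower is better)."""
    return 1.0 / throughput_per_watt_gain(power_reduction, relative_throughput)


if __name__ == "__main__":
    gain = throughput_per_watt_gain(1.7, 0.93)
    energy = relative_energy_per_unit_work(1.7, 0.93)
    print(f"throughput per watt: x{gain:.2f}")
    print(f"energy per unit of work: x{energy:.2f}")
```

With the abstract's figures, this back-of-the-envelope estimate gives roughly 1.58× the throughput per watt of the CMOS design, or about 0.63× the energy per unit of work.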


      Published In

      ISCA '10: Proceedings of the 37th annual international symposium on Computer architecture
      June 2010
      520 pages
      ISBN: 9781450300537
      DOI: 10.1145/1815961

      Also in: ACM SIGARCH Computer Architecture News, Volume 38, Issue 3 (ISCA '10)
      June 2010
      508 pages
      ISSN: 0163-5964
      DOI: 10.1145/1816038

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      In-Cooperation

      • IEEE CS

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. STT-MRAM
      2. power-efficiency

      Qualifiers

      • Research-article

      Conference

      ISCA '10

      Acceptance Rates

      Overall Acceptance Rate 543 of 3,203 submissions, 17%
