Dark silicon and the end of multicore scaling

Authors:

Hadi Esmaeilzadeh,

Renee St. Amant,

Karthikeyan Sankaralingam,

Doug Burger

ISCA '11: Proceedings of the 38th annual international symposium on Computer architecture

Pages 365 - 376

https://doi.org/10.1145/2000064.2000108

Published: 04 June 2011

Abstract

Since 2005, processor designers have increased core counts to exploit Moore's Law scaling, rather than focusing on single-core performance. The failure of Dennard scaling, to which the shift to multicore parts is partially a response, may soon limit multicore scaling just as single-core scaling has been curtailed. This paper models multicore scaling limits by combining device scaling, single-core scaling, and multicore scaling to measure the speedup potential for a set of parallel workloads for the next five technology generations. For device scaling, we use both the ITRS projections and a set of more conservative device scaling parameters. To model single-core scaling, we combine measurements from over 150 processors to derive Pareto-optimal frontiers for area/performance and power/performance. Finally, to model multicore scaling, we build a detailed performance model of upper-bound performance and lower-bound core power. The multicore designs we study include single-threaded CPU-like and massively threaded GPU-like multicore chip organizations with symmetric, asymmetric, dynamic, and composed topologies. The study shows that regardless of chip organization and topology, multicore scaling is power limited to a degree not widely appreciated by the computing community. Even at 22 nm (just one year from now), 21% of a fixed-size chip must be powered off, and at 8 nm, this number grows to more than 50%. Through 2024, only 7.9x average speedup is possible across commonly used parallel workloads, leaving a nearly 24-fold gap from a target of doubled performance per generation.
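
To make the idea of a power-limited chip concrete, here is a minimal sketch; it is not the paper's model. It combines classic Amdahl's law with a simple area and power budget for a symmetric multicore. The function names and every numeric input (area, power, per-core costs, parallel fraction) are hypothetical values chosen for illustration. The paper's own model additionally relies on measured Pareto frontiers and device scaling projections.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Classic Amdahl's law: speedup of n identical cores over one such core."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


def power_limited_chip(area_budget, power_budget, core_area, core_power):
    """Return (cores that fit by area, cores that can be powered, dark fraction).

    Simplifying assumption (ours): every core has the same area and power
    cost, and any core that fits on the die but exceeds the power budget
    is left dark.
    """
    cores_by_area = int(area_budget // core_area)
    cores_by_power = int(power_budget // core_power)
    active = min(cores_by_area, cores_by_power)
    dark_fraction = 1.0 - active / cores_by_area if cores_by_area else 0.0
    return cores_by_area, active, dark_fraction


# Hypothetical inputs, not taken from the paper.
fit, active, dark = power_limited_chip(
    area_budget=100.0,   # mm^2 available for cores
    power_budget=80.0,   # W available for cores
    core_area=5.0,       # mm^2 per core
    core_power=5.0,      # W per core
)
speedup = amdahl_speedup(parallel_fraction=0.95, n_cores=active)
```

With these inputs, 20 cores fit on the die but only 16 can be powered, so about a fifth of the core area stays dark, and the serial fraction caps the speedup well below the active core count.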

Index Terms

Dark silicon and the end of multicore scaling
1. Computer systems organization
  1. Architectures
2. Hardware
  1. Electronic design automation
    1. Modeling and parameter extraction

Published In

ISCA '11: Proceedings of the 38th annual international symposium on Computer architecture

June 2011

488 pages

ISBN:9781450304726

DOI:10.1145/2000064

General Chairs: Ravi Iyer (Intel), Qing Yang (University of Rhode Island)
Program Chair: Antonio González (Intel and UPC)

ACM SIGARCH Computer Architecture News Volume 39, Issue 3
ISCA '11
June 2011
462 pages
ISSN:0163-5964
DOI:10.1145/2024723

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ISCA '11

Sponsor:

SIGARCH

ISCA '11: The 38th Annual International Symposium on Computer Architecture

June 4 - 8, 2011

San Jose, California, USA

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%
