DOI: 10.1145/2591635.2667187
research-article

Design tradeoffs for tiled CMP on-chip networks

Published: 28 June 2006

Abstract

We develop detailed area and energy models for on-chip interconnection networks and describe tradeoffs in the design of efficient networks for tiled chip multiprocessors. Using these detailed models we investigate how aspects of the network architecture including topology, channel width, routing strategy, and buffer size affect performance and impact area and energy efficiency. We simulate the performance of a variety of on-chip networks designed for tiled chip multiprocessors implemented in an advanced VLSI process and compare area and energy efficiencies estimated from our models. We demonstrate that the introduction of a second parallel network can increase performance while improving efficiency, and evaluate different strategies for distributing traffic over the subnetworks. Drawing on insights from our analysis, we present a concentrated mesh topology with replicated subnetworks and express channels which provides a 24% improvement in area efficiency and a 48% improvement in energy efficiency over other networks evaluated in this study.

Published In

ACM International Conference on Supercomputing 25th Anniversary Volume
June 2014
94 pages
ISBN:9781450328401
DOI:10.1145/2591635
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%
