DOI: 10.5555/1898953.1898997
Article

A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture

Published: 25 April 2006

Abstract

High-performance processor designs are moving toward integrating a large number of processing cores on a single chip. The IBM Cyclops-64 (C64) is a petaflop supercomputer built on multi-core system-on-a-chip technology. Each C64 chip employs a multistage pipelined crossbar switch as its on-chip interconnection network to provide high-bandwidth, low-latency communication between the 160 thread processing cores, the on-chip SRAM memory banks, and other components.
In this paper, we present a study of the architecture and performance of the C64 on-chip interconnection network through simulation. Our experimental results provide the following observations on the network behavior: (1) Dedicated channels can be created between any output port and any input port of the C64 crossbar with latency as low as 7 cycles. The C64 crossbar has the potential to reach the full hardware bandwidth and exhibits non-blocking behavior; (2) The C64 crossbar is a stable network; (3) The network logic design appears to provide a reasonable opportunity for sharing the channel bandwidth between traffic in either direction; (4) A simple circular neighbor arbitration scheme can achieve performance competitive with that of the complex segmented LRU (Least Recently Used) matrix arbitration scheme without losing fairness; (5) Application-driven benchmarks provide results comparable to those of synthetic workloads.
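
The abstract does not spell out how circular neighbor arbitration works. As a rough illustration only, the sketch below assumes it behaves like a rotating-priority (round-robin) arbiter, where the search for the next winner starts at the neighbor of the port granted last. The class `CircularArbiter` and the helper `grant_counts` are hypothetical names introduced here and are not taken from the paper or from the C64 hardware.

```python
from typing import List, Optional, Sequence


class CircularArbiter:
    """Rotating-priority arbiter (an assumed model, not the C64 logic).

    After a port wins, its immediate neighbor gets the highest priority
    in the next round, so no requester can be starved.
    """

    def __init__(self, num_ports: int) -> None:
        if num_ports <= 0:
            raise ValueError("num_ports must be positive")
        self.num_ports = num_ports
        # Start as if the last port had just won, so port 0 is checked first.
        self.last_granted = num_ports - 1

    def arbitrate(self, requests: Sequence[bool]) -> Optional[int]:
        """Return the winning port for this cycle, or None if nobody requests."""
        if len(requests) != self.num_ports:
            raise ValueError("one request flag per port is required")
        # Walk the ring once, starting at the neighbor of the last winner.
        for offset in range(1, self.num_ports + 1):
            port = (self.last_granted + offset) % self.num_ports
            if requests[port]:
                self.last_granted = port
                return port
        return None


def grant_counts(num_ports: int, cycles: int) -> List[int]:
    """Count grants per port when every port requests in every cycle."""
    arbiter = CircularArbiter(num_ports)
    counts = [0] * num_ports
    for _ in range(cycles):
        winner = arbiter.arbitrate([True] * num_ports)
        if winner is not None:
            counts[winner] += 1
    return counts
```

Under this assumed model, a saturated arbiter hands out grants evenly across ports, which is one simple notion of the fairness the abstract refers to. The only state it keeps is the index of the last winner, whereas an LRU matrix arbiter keeps a priority bit for each pair of ports.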



Published In

IPDPS'06: Proceedings of the 20th international conference on Parallel and distributed processing
April 2006
399 pages
ISBN: 1424400546

Sponsors

  • IEEE CS TCPP: IEEE Computer Society Technical Committee on Parallel Processing

Publisher

IEEE Computer Society

United States
