
Architecture

Published: 01 November 1998



Reviews

Peter C. Patton

Clearly, the future of high-performance computing lies in building highly parallel computers out of high-volume, low-cost components rather than systems of limited multiplicity using low-volume, high-cost components. This paper is an excellent review of scalable parallel processing (SPP), past, present, and future, by six of its leading practitioners. The authors review the history of SPP architectures and their applications in science, then turn to the limits on SPP performance imposed by looming constraints on growth in processor, memory, and software performance. If we define the goal of computer architecture as increasing performance by an order of magnitude through system organization alone, the promise of architecture is still considerable. The architect must exploit improvements in component technology while designing around performance limits.

Our experience applying large commercial parallel systems to online transaction processing with multitier client/server applications indicates that, today, operating system limits to scalability are reached before hardware limits. The basic issue is preemptive multithreading. For example, the operating system may saturate at 1,000 or so threads, and some experiments with Java indicate saturation at 110 to 120 threads per Java Virtual Machine. The authors point out in a sidebar that the Tera MTA-1 hardware offers one solution to this problem. Fortunately, there is at least one software solution as well: the CHoPP asynchronous non-preemptive tasking system (ANTs).

The trick to gaining parallel performance is hiding latency, but if each task switch costs thousands of instruction cycles, task switching eventually becomes self-defeating. Techniques that cut this cost, whether Tera's hardware, which reduces a task switch to a couple of instruction cycles, or ANTs software, which reduces it to 10 to 20 instruction cycles on an Intel or RISC processor, may prove significant as hardware and software architectural techniques.
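The distinction the review draws between preemptive threads and non-preemptive tasking can be sketched in a few lines. The code below is a minimal illustration of cooperative (non-preemptive) scheduling in the spirit of the ANTs approach, not its actual implementation: tasks yield control voluntarily, so a "task switch" costs roughly a function call rather than a kernel-mediated preemption. All names here are illustrative.

```python
# Minimal sketch of non-preemptive (cooperative) tasking: generators act
# as tasks, and a round-robin scheduler resumes each one in turn. A task
# switch is just a next() call, with no OS thread or kernel involvement.
from collections import deque

def worker(name, steps, log):
    """A task that does `steps` units of work, yielding between units."""
    for i in range(steps):
        log.append((name, i))   # simulate one unit of work
        yield                   # cooperative switch point

def run(tasks):
    """Round-robin scheduler: resume each runnable task until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task up to its next yield
            ready.append(task)  # still runnable: requeue it
        except StopIteration:
            pass                # task finished; drop it

log = []
run([worker("A", 3, log), worker("B", 2, log)])
# Work from A and B is interleaved with no OS threads:
# [('A', 0), ('B', 0), ('A', 1), ('B', 1), ('A', 2)]
```

Because the switch point is explicit, there is no saturation at some fixed thread count; the trade-off is that a task which never yields starves the others, which is exactly what preemption exists to prevent.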


Published In

Communications of the ACM  Volume 41, Issue 11
Nov. 1998
93 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/287831
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States
