The Warp computer: Architecture, implementation, and performance

Published: 01 December 1987

Abstract

The Warp machine is a systolic array computer of linearly connected cells, each of which is a programmable processor capable of performing 10 million floating-point operations per second (10 MFLOPS). A typical Warp array includes ten cells, thus having a peak computation rate of 100 MFLOPS. The Warp array can be extended to include more cells to accommodate applications capable of using the increased computational bandwidth. Warp is integrated as an attached processor into a Unix host system. Programs for Warp are written in a high-level language supported by an optimizing compiler. The first ten-cell prototype was completed in February 1986; delivery of production machines started in April 1987. Extensive experimentation with both the prototype and production machines has demonstrated that the Warp architecture is effective in the application domain of robot navigation as well as in other fields such as signal processing, scientific computation, and computer vision research. For these applications, Warp is typically several hundred times faster than a VAX 11/780 class computer. This paper describes the architecture, implementation, and performance of the Warp machine. Each major architectural decision is discussed and evaluated with system, software, and application considerations. The programming model and tools developed for the machine are also described. The paper concludes with performance data for a large number of applications.

Published In

IEEE Transactions on Computers  Volume 36, Issue 12
Dec. 1987
145 pages

Publisher

IEEE Computer Society

United States

Author Tags

  1. Computer system implementation
  2. computer vision
  3. image processing
  4. optimizing compiler
  5. parallel processors
  6. performance evaluation
  7. pipelined processor
  8. scientific computing
  9. signal processing
  10. systolic array
  11. vision research

Qualifiers

  • Research-article
