DOI: 10.5555/110382.110597

Scan primitives for vector computers

Published: 01 October 1990

Abstract

This paper describes an optimized implementation of a set of scan (also called all-prefix-sums) primitives on a single processor of a CRAY Y-MP, and demonstrates that their use leads to greatly improved performance for several applications that cannot be vectorized with existing compiler technology. The algorithm used to implement the scans is based on an algorithm for parallel computers and is applicable with minor modifications to any register-based vector computer. On the CRAY Y-MP, the asymptotic running time of the plus-scan is about 2.25 times that of a vector add, and is within 20% of optimal. An important aspect of our implementation is that the segmented versions of these scans are only marginally more expensive than the unsegmented versions. These segmented versions can be used to execute a scan on multiple data sets without having to pay the vector startup cost ($n_{1/2}$) for each set.
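
To make the two primitives concrete, here is a minimal sequential sketch in Python of what a plus-scan and a segmented plus-scan compute. It is only a reference for the results, not the vectorized CRAY Y-MP implementation the paper describes. The function names `plus_scan` and `segmented_plus_scan`, the exclusive-scan convention (each output excludes its own input), and the use of boolean head flags to mark where a segment starts are all assumptions made for this illustration.

```python
from typing import List


def plus_scan(values: List[int]) -> List[int]:
    """Exclusive all-prefix-sums: out[i] is the sum of values[0..i-1]."""
    out = []
    running = 0
    for v in values:
        out.append(running)  # emit the sum of everything before this element
        running += v
    return out


def segmented_plus_scan(values: List[int], head_flags: List[bool]) -> List[int]:
    """Exclusive plus-scan that restarts wherever head_flags[i] is True.

    head_flags is an assumed encoding: True marks the first element
    of each segment, so several independent data sets share one vector.
    """
    if len(values) != len(head_flags):
        raise ValueError("values and head_flags must have the same length")
    out = []
    running = 0
    for v, is_head in zip(values, head_flags):
        if is_head:
            running = 0  # a new segment begins: reset the running sum
        out.append(running)
        running += v
    return out


data = [5, 1, 3, 4, 3, 9, 2, 6]
heads = [True, False, True, False, False, False, True, False]

print(plus_scan(data))                   # [0, 5, 6, 9, 13, 16, 25, 27]
print(segmented_plus_scan(data, heads))  # [0, 5, 0, 3, 7, 10, 0, 2]
```

In the segmented call, the three data sets `[5, 1]`, `[3, 4, 3, 9]`, and `[2, 6]` are scanned in a single pass over one vector. On a vector machine, this is the sense in which one long segmented operation can replace many short ones that would each incur the startup cost.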
The paper describes a radix sorting routine based on the scans that is 13 times faster than a Fortran version and within 20% of a highly optimized library sort routine; three operations on trees that are between 10 and 20 times faster than the corresponding C versions; and a connectionist learning algorithm that is 10 times faster than the corresponding C version for sparse and irregular networks.
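
As an illustration of how a sort can be built from scans, the sketch below shows one common formulation: a radix sort that processes one bit per pass, using plus-scans to compute each key's destination index. The abstract does not say which radix-sort variant the paper's routine uses, so this is an assumed stand-in rather than the authors' method. The names `split_by_bit` and `radix_sort` and the `num_bits` parameter are hypothetical, and the keys are assumed to be non-negative integers that fit in `num_bits` bits.

```python
from typing import List


def plus_scan(values: List[int]) -> List[int]:
    """Exclusive all-prefix-sums: out[i] is the sum of values[0..i-1]."""
    out = []
    running = 0
    for v in values:
        out.append(running)
        running += v
    return out


def split_by_bit(keys: List[int], bit: int) -> List[int]:
    """Stably move keys whose given bit is 0 ahead of keys whose bit is 1."""
    flags = [(k >> bit) & 1 for k in keys]
    not_flags = [1 - f for f in flags]

    # A scan over the 0-flags gives each 0-key its slot in the lower part.
    zero_pos = plus_scan(not_flags)
    num_zeros = sum(not_flags)
    # A scan over the 1-flags, offset by the number of 0-keys, gives each
    # 1-key its slot in the upper part.
    one_pos = [num_zeros + p for p in plus_scan(flags)]

    out = [0] * len(keys)
    for i, k in enumerate(keys):
        dest = one_pos[i] if flags[i] else zero_pos[i]
        out[dest] = k  # scatter the key to its computed destination
    return out


def radix_sort(keys: List[int], num_bits: int) -> List[int]:
    """Sort non-negative integers below 2**num_bits, least significant bit first."""
    for bit in range(num_bits):
        keys = split_by_bit(keys, bit)
    return keys


print(radix_sort([5, 7, 3, 1, 4, 2, 7, 2], num_bits=3))  # [1, 2, 2, 3, 4, 5, 7, 7]
```

Because each split is stable, the ordering established by lower bits is preserved when higher bits are processed, which is what makes the least-significant-bit-first loop correct. The flag computation, the scans, and the scatter are all data-parallel steps with no loop-carried dependence other than the scan itself.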

Published In

Supercomputing '90: Proceedings of the 1990 ACM/IEEE conference on Supercomputing
November 1990
982 pages
ISBN: 0897914120

Publisher

IEEE Computer Society Press

Washington, DC, United States
