DOI: 10.1145/1065895.1065899

Metrics and models for reordering transformations

Published: 08 June 2004

Abstract

Irregular applications frequently exhibit poor performance on contemporary computer architectures, in large part because of their inefficient use of the memory hierarchy. Run-time data- and iteration-reordering transformations have been shown to improve the locality and therefore the performance of irregular benchmarks. This paper describes models for determining which combination of run-time data- and iteration-reordering heuristics will result in the best performance for a given dataset. We propose that the data- and iteration-reordering transformations be viewed as approximating minimal linear arrangements on two separate hypergraphs: a spatial locality hypergraph and a temporal locality hypergraph. Our results measure the efficacy of locality metrics based on these hypergraphs in guiding the selection of data- and iteration-reordering heuristics. We also introduce new iteration- and data-reordering heuristics based on the hypergraph models that result in better performance than do previous heuristics.
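To make the minimal-linear-arrangement view concrete, the sketch below scores an ordering by summing, over all hyperedges, the distance between the first and last position that the hyperedge touches. This is only one plausible formalization, assumed here for illustration; the abstract does not define the paper's metrics, and the function names, the span-based cost, and the toy access pattern are all hypothetical. It assumes the spatial locality hypergraph has one vertex per data item and one hyperedge per iteration, and that the temporal locality hypergraph is its dual.

```python
from typing import Dict, Hashable, List, Sequence


def positions(order: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Map each vertex to its slot in a linear arrangement."""
    return {v: i for i, v in enumerate(order)}


def span_cost(hyperedges: Sequence[Sequence[Hashable]],
              order: Sequence[Hashable]) -> int:
    """Sum over hyperedges of (max slot - min slot) under `order`.

    Assumed cost function: a smaller total means the vertices of each
    hyperedge sit closer together in the arrangement.
    """
    pos = positions(order)
    total = 0
    for edge in hyperedges:
        slots = [pos[v] for v in edge]
        if slots:
            total += max(slots) - min(slots)
    return total


def dual(hyperedges: Sequence[Sequence[Hashable]]) -> List[List[int]]:
    """Build the dual hypergraph: one hyperedge per original vertex,
    listing the indices of the original hyperedges that contain it."""
    by_vertex: Dict[Hashable, List[int]] = {}
    for i, edge in enumerate(hyperedges):
        for v in edge:
            by_vertex.setdefault(v, []).append(i)
    return [by_vertex[v] for v in sorted(by_vertex)]


# Toy access pattern (hypothetical): iteration i touches data items accesses[i].
accesses = [(0, 3), (1, 4), (0, 1), (3, 4)]

# Spatial view: vertices are data items, hyperedges are iterations.
spatial_before = span_cost(accesses, [0, 1, 2, 3, 4])
spatial_after = span_cost(accesses, [3, 0, 1, 4, 2])

# Temporal view: vertices are iterations, hyperedges are data items.
temporal_edges = dual(accesses)
temporal_before = span_cost(temporal_edges, [0, 1, 2, 3])
temporal_after = span_cost(temporal_edges, [0, 2, 1, 3])

print(spatial_before, spatial_after, temporal_before, temporal_after)
```

Under this assumed cost, a data reordering lowers the spatial score and an iteration reordering lowers the temporal score, so candidate heuristics could be compared by the scores they produce on a given dataset before committing to one.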

Published In

MSP '04: Proceedings of the 2004 workshop on Memory system performance
June 2004
70 pages
ISBN: 1581139411
DOI: 10.1145/1065895

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data locality
  2. inspector/executor
  3. locality metrics
  4. optimization
  5. run-time reordering transformations
  6. spatial locality graph
  7. temporal locality hypergraph

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 6 of 20 submissions, 30%
