
Maintaining Cache Coherence through Compiler-Directed Data Prefetching

Published: 15 September 1998

Abstract

In this paper, we propose a compiler-directed cache coherence scheme that uses data prefetching to enforce cache coherence in large-scale distributed shared-memory (DSM) systems. The Cache Coherence with Data Prefetching (CCDP) scheme uses compiler analyses to identify potentially stale and nonstale data references in a parallel program, and it enforces cache coherence by prefetching the potentially stale references. In this manner, the CCDP scheme brings up-to-date data into the caches to avoid stale references and also hides the latency of these memory accesses. The scheme also prefetches the nonstale references to hide their memory latencies. To evaluate the performance impact of the CCDP scheme on a real system, we applied it to five applications from the SPEC CFP95 and CFP92 benchmark suites and executed the resulting codes on the Cray T3D. The experimental results indicate that, for all of the applications studied, our scheme provides significant performance improvements by caching shared data and by using data prefetching both to enforce cache coherence and to hide memory latency.
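The core idea of the abstract, that prefetching a potentially stale reference refreshes the local copy before it is used, can be illustrated with a toy model. The sketch below is not the CCDP implementation. It assumes a write-through private cache with no hardware coherence, and the names `PrivateCache` and `run` are hypothetical. The decision to prefetch, which CCDP derives from compiler analysis, is reduced here to a boolean flag.

```python
class PrivateCache:
    """Toy per-processor cache with no hardware coherence (assumed model)."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory, modeled as a list
        self.lines = {}       # address -> locally cached value

    def read(self, addr):
        # A hit returns the local copy, even if another processor
        # has updated main memory since the copy was cached.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        # Write-through (an assumption): update local copy and main memory.
        self.lines[addr] = value
        self.memory[addr] = value

    def prefetch(self, addr):
        # Always fetch from main memory, replacing any cached copy.
        self.lines[addr] = self.memory[addr]


def run(use_prefetch):
    """Three epochs: P0 caches a value, P1 overwrites it, P0 reads it again."""
    memory = [0]
    p0, p1 = PrivateCache(memory), PrivateCache(memory)
    p0.read(0)           # epoch 1: P0 caches the old value 0
    p1.write(0, 42)      # epoch 2: P1 updates the shared location
    if use_prefetch:     # epoch 3: the read is flagged as potentially stale
        p0.prefetch(0)   # so a prefetch is issued ahead of the use
    return p0.read(0)


if __name__ == "__main__":
    print("without prefetch:", run(False))
    print("with prefetch:", run(True))
```

Without the prefetch, P0's final read hits its own outdated copy. With it, the read sees the value written by P1. The model does not capture latency hiding, which in a real system comes from issuing the prefetch early enough to overlap with computation.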


Published In

Journal of Parallel and Distributed Computing, Volume 53, Issue 2
Sept. 15, 1998
84 pages
ISSN: 0743-7315
Editors: Allan Gottlieb, Kai Hwang, Sartaj Sahni

Publisher

Academic Press, Inc.

United States

Author Tags

  1. Compiler-directed cache coherence
  2. compiler
  3. data prefetching
  4. memory system
  5. shared-memory multiprocessors

Qualifiers

  • Research-article
