DOI: 10.1145/977091.977116
Article

Fighting the memory wall with assisted execution

Published: 14 April 2004

Abstract

Assisted execution is a form of simultaneous multithreading in which a set of auxiliary "assistant" threads, called nanothreads, is attached to each thread of an application. Nanothreads are lightweight threads which run on the same processor as the main (application) thread and help execute the main thread as fast as possible. Nanothreads exploit resources that are idled in the processor because of hazards due to program dependencies and memory access delays.

Assisted execution has the potential to alter the current trade-offs between static and dynamic execution mechanisms. Nanothreads can monitor and reconfigure the underlying hardware, can emulate hardware and can profile applications with little or no interference to improve the program on-line or off-line.

We demonstrate the power of assisted execution with an important application, namely data prefetching to fight the memory wall problem. Simulation results on several SPEC95 benchmarks show that sequential and stride prefetching implemented with nanothreads performs just as well as ideal hardware prefetchers.
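For readers unfamiliar with the two prefetching policies named in the abstract, the following is a minimal sketch of generic sequential and stride prefetching as they are commonly described in textbooks. It is not the paper's nanothread implementation: the class and function names, the 32-byte block size, the per-instruction table keyed by load address (`pc`), and the "same stride seen twice" confirmation rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    """State kept per load instruction (hypothetical layout)."""
    last_addr: int           # last data address seen for this load
    stride: int = 0          # most recent address delta
    confirmed: bool = False  # True once the same non-zero stride repeats


class StridePrefetcher:
    """Toy stride detector; an illustration, not the paper's design."""

    def __init__(self, block_size=32, degree=1):
        self.block_size = block_size  # assumed cache block size in bytes
        self.degree = degree          # how many strides ahead to prefetch
        self.table = {}               # maps load pc -> StrideEntry

    def access(self, pc, addr):
        """Record a load at `pc` to `addr`; return block addresses to prefetch."""
        entry = self.table.get(pc)
        if entry is None:
            # First time this load is seen: nothing to predict yet.
            self.table[pc] = StrideEntry(last_addr=addr)
            return []
        stride = addr - entry.last_addr
        # Only trust a stride after it has been observed twice in a row.
        entry.confirmed = stride != 0 and stride == entry.stride
        entry.stride = stride
        entry.last_addr = addr
        if not entry.confirmed:
            return []
        targets = []
        for k in range(1, self.degree + 1):
            # Align the predicted address down to its cache block.
            block = (addr + k * stride) // self.block_size * self.block_size
            if block not in targets:
                targets.append(block)
        return targets


def sequential_prefetch(addr, block_size=32, degree=1):
    """On a miss to `addr`, prefetch the next `degree` consecutive blocks."""
    base = addr // block_size * block_size
    return [base + k * block_size for k in range(1, degree + 1)]
```

In this sketch a load that walks an array with a constant 64-byte stride triggers no prefetch on its first two accesses and a one-block prefetch from the third access on, whereas the sequential policy simply requests the blocks that follow the missing one regardless of access history.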

Published In

CF '04: Proceedings of the 1st conference on Computing frontiers
April 2004
522 pages
ISBN:1581137419
DOI:10.1145/977091

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cache memories
  2. latency tolerance
  3. prefetching
  4. simultaneous multithreading
  5. superscalar processors

Qualifiers

  • Article

Conference

CF04: Computing Frontiers Conference
April 14 - 16, 2004
Ischia, Italy

Acceptance Rates

Overall Acceptance Rate 273 of 785 submissions, 35%
