DOI: 10.1145/1242531.1242568
Article

Speculative supplier identification for reducing power of interconnects in snoopy cache coherence protocols

Published: 07 May 2007

Abstract

In this work we reduce interconnect power dissipation in symmetric multiprocessors (SMPs). We revisit snoopy cache coherence protocols and reduce unnecessary interconnect activity by speculating which nodes are expected to supply the missing data. Conventional snoopy cache coherence protocols broadcast requests to all nodes, which reduces the latency of cache-to-cache transfer misses at the expense of higher interconnect power. We show that the associated power dissipation can be reduced if such requests are broadcast selectively, only to the nodes more likely to supply the missing data. Power is saved because access is limited to the interconnect components between the requester and the supplier node. We evaluate our technique using shared-memory applications and show that it can reduce interconnect power by 21% in a 4-way multiprocessor without compromising performance, and with negligible hardware overhead.
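
The idea of sending a request only to the node most likely to supply the data can be illustrated with a small sketch. The abstract does not describe the actual predictor, so everything here is an assumption made for illustration: the `SupplierPredictor` class, its last-supplier table, the `handle_miss` function, and the cost model, which simply counts how many nodes are snooped per miss and falls back to a full broadcast after a misprediction.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SupplierPredictor:
    """Hypothetical last-supplier table: block address -> node that supplied it last."""
    num_nodes: int
    table: Dict[int, int] = field(default_factory=dict)

    def predict(self, block: int) -> Optional[int]:
        # Returns None when the block has never been seen.
        return self.table.get(block)

    def update(self, block: int, supplier: int) -> None:
        self.table[block] = supplier


def handle_miss(predictor: SupplierPredictor, requester: int,
                block: int, actual_supplier: int) -> int:
    """Return how many nodes are snooped to service one cache-to-cache miss.

    Assumed cost model (not from the paper): a conventional broadcast snoops
    every other node; a correct prediction snoops one node; a misprediction
    pays for the targeted request plus a fallback broadcast.
    """
    others = predictor.num_nodes - 1
    guess = predictor.predict(block)
    if guess is None or guess == requester:
        snooped = others            # no usable prediction: broadcast
    elif guess == actual_supplier:
        snooped = 1                 # request sent only to the supplier
    else:
        snooped = 1 + others        # wrong guess, then broadcast
    predictor.update(block, actual_supplier)
    return snooped


# Hypothetical trace of (requester, block, actual supplier) in a 4-node system.
predictor = SupplierPredictor(num_nodes=4)
trace = [(0, 0x40, 2), (0, 0x40, 2), (0, 0x80, 1), (0, 0x40, 3)]
costs = [handle_miss(predictor, r, b, s) for r, b, s in trace]
print(costs, sum(costs), "vs broadcast", 3 * len(trace))
```

In this toy trace the selective scheme snoops 11 nodes in total where always broadcasting would snoop 12. The saving depends entirely on prediction accuracy, which is why a misprediction is charged more than a plain broadcast in this model.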

Cited By

  • (2012) Counting stream registers: An efficient and effective snoop filter architecture. 2012 International Conference on Embedded Computer Systems (SAMOS), pp. 120-127. DOI: 10.1109/SAMOS.2012.6404165. Online publication date: Jul-2012

    Published In

    CF '07: Proceedings of the 4th international conference on Computing frontiers
    May 2007
    300 pages
    ISBN:9781595936837
    DOI:10.1145/1242531

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. SMP
    2. cache coherence protocol
    3. interconnect
    4. power

    Qualifiers

    • Article

    Conference

    CF07
    Sponsor:
    CF07: Computing Frontiers Conference
    May 7 - 9, 2007
    Ischia, Italy

    Acceptance Rates

    Overall Acceptance Rate 273 of 785 submissions, 35%
