DOI: 10.1145/2723772.2723779
research-article

Cycle-based Model to Evaluate Consistency Protocols within a Multi-protocol Compilation Tool-chain

Published: 08 February 2015

Abstract

Many-core processors integrate hundreds to thousands of cores, distributed memories and a dedicated network on a single chip. At this scale, providing a shared memory system has to rely on efficient hardware mechanisms and/or data consistency protocols. Previous work has explored several consistency mechanisms designed for many-core processors and concluded that no single protocol fits all applications and hardware contexts. It is therefore relevant to use a multi-protocol platform, in which the shared data of an application can be managed by different protocols. Protocols are chosen and configured at compile time, following a static analysis of the application and the profiling of its memory accesses. In this work, we propose a high-level timed model that we use to evaluate, at compile time, the consistency protocol that has been assigned to a given application and a given Network-on-Chip (NoC). This model computes the number of NoC cycles needed for each data access, which can be turned into mean access cycles for each core or each shared data item. The model is not as accurate as a cycle-based NoC simulator or an instruction set simulator. However, it is accurate enough to evaluate the impact of choosing and configuring a protocol, and its lightweight implementation allows it to run within an operational research optimization loop. To validate our approach, we apply the model to compare three consistency protocols on a 2D mesh network, compiling a parallel convolution application.
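
The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of counting NoC cycles per data access on a 2D mesh and averaging them per core. Everything in it is an assumption made for illustration and is not the authors' model: the per-hop and per-router costs, the XY-routing hop count, the reduction of a protocol to a request/reply round trip with a fixed home core, and all function names. Network contention is ignored.

```python
from collections import defaultdict
from statistics import mean

# Illustrative constants only; these values are assumptions, not from the paper.
HOP_CYCLES = 1      # assumed cycles to traverse one mesh link
ROUTER_CYCLES = 2   # assumed cycles spent in each router on the path


def mesh_coords(core_id, width):
    """Map a linear core id to (x, y) on a 2D mesh with `width` columns."""
    return core_id % width, core_id // width


def hops(src, dst, width):
    """Manhattan distance between two cores (hop count under XY routing)."""
    sx, sy = mesh_coords(src, width)
    dx, dy = mesh_coords(dst, width)
    return abs(sx - dx) + abs(sy - dy)


def message_cycles(src, dst, width):
    """Cycles for one message: links traversed plus routers crossed."""
    h = hops(src, dst, width)
    if h == 0:
        return 0  # local access, no NoC traffic in this simplified model
    return h * HOP_CYCLES + (h + 1) * ROUTER_CYCLES


def access_cycles(requester, home, width):
    """Remote access modelled as a request/reply round trip to the home core."""
    return message_cycles(requester, home, width) + message_cycles(home, requester, width)


def mean_cycles_per_core(trace, home_of, width):
    """Average NoC cycles per access for each core.

    trace   -- list of (core_id, data_name) accesses, e.g. from profiling
    home_of -- dict mapping data_name to the core assumed to hold the data
    """
    per_core = defaultdict(list)
    for core, data in trace:
        per_core[core].append(access_cycles(core, home_of[data], width))
    return {core: mean(costs) for core, costs in per_core.items()}


if __name__ == "__main__":
    trace = [(0, "a"), (0, "b"), (5, "a")]
    home_of = {"a": 5, "b": 0}
    print(mean_cycles_per_core(trace, home_of, width=4))
```

Because each access costs only a few arithmetic operations, a function of this kind is cheap enough to be called repeatedly, which is the property the abstract relies on when it mentions running inside an optimization loop. Grouping the same costs by data name instead of by core would give the mean access cycles per shared data item.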


Cited By

  • (2016) Network Contention-Aware Method to Evaluate Data Coherency Protocols within a Compilation Toolchain. 2016 IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC), pages 249-256. DOI: 10.1109/MCSoC.2016.54. Online publication date: Sep-2016

      Published In

      COSMIC '15: Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores
      February 2015
      74 pages
      ISBN: 9781450333160
      DOI: 10.1145/2723772
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Cache coherence
      2. Compilation
      3. Many-cores
      4. Performance Evaluation

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      COSMIC '15
