research-article

Accurately Measuring Contention in Mesh NoCs in Time-Sensitive Embedded Systems

Published: 03 April 2023

Abstract

The computing capacity demanded by embedded systems is on the rise as software implements more functionalities, ranging from best-effort entertainment functions to performance-guaranteed safety-related functions. Heterogeneous manycore processors that use wormhole mesh (wmesh) Networks-on-Chip (NoCs) as the main communication means, and hence as a main point of contention among applications, are increasingly considered to deliver the required computing performance. Most research efforts on software timing analysis have focused on deriving bounds (estimates) on the contention that tasks can suffer when accessing wmesh NoCs. However, less effort has been devoted to an equally important problem: accurately measuring the actual contention that tasks generate on each other in the wmesh. Such measurement is instrumental during system validation to diagnose any software timing misbehavior and to determine which tasks are particularly affected by contention on specific wmesh routers. In this article, we work on the foundations of contention measurement in wmesh NoCs and propose, and explain the rationale of, a golden metric called task PairWise Contention (PWC). PWC allows ascribing the actual share of the contention that a given task suffers in the wmesh to each of its co-runner tasks at packet level. We also introduce and formalize a Golden Reference Value (GRV) for PWC that defines a criterion to fairly break down the contention suffered by a task among its co-runner tasks in the wmesh. Our evaluation shows that GRV effectively captures how contention occurs by identifying the actual core (task) causing contention and whether contention is caused by local or remote interference in the wmesh.
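
The abstract does not state the formal GRV criterion, so the following is only a toy sketch of what a pairwise contention breakdown can look like. It assumes a hypothetical per-cycle stall trace (`StallCycle`) and a hypothetical equal-split rule, in which each cycle a victim task's packet is stalled is divided evenly among the co-runner tasks blocking it in that cycle. Neither the trace format nor the equal-split rule is taken from the article.

```python
from collections import defaultdict
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class StallCycle:
    """One cycle in which a packet of `victim` is stalled (hypothetical trace record)."""
    victim: str       # task whose packet is stalled
    router: tuple     # (x, y) coordinates of the router where the stall is observed
    blockers: tuple   # co-runner tasks holding the contended resource in this cycle


def pairwise_contention(stalls):
    """Return {(victim, blocker): stall cycles ascribed to blocker}.

    Assumption for illustration only: every stalled cycle is split
    evenly among the tasks blocking the victim in that cycle.
    """
    pwc = defaultdict(Fraction)
    for stall in stalls:
        if not stall.blockers:
            continue  # a stall with no co-runner is not ascribed to anyone
        share = Fraction(1, len(stall.blockers))
        for blocker in stall.blockers:
            pwc[(stall.victim, blocker)] += share
    return dict(pwc)


# Three stalled cycles suffered by task A at two routers.
trace = [
    StallCycle("A", (0, 0), ("B",)),
    StallCycle("A", (0, 0), ("B", "C")),
    StallCycle("A", (1, 0), ("C",)),
]
pwc = pairwise_contention(trace)
for (victim, blocker), cycles in sorted(pwc.items()):
    print(f"{victim} <- {blocker}: {cycles} cycles")
```

Under this assumed rule the pairwise values for a victim always sum to the total number of stalled cycles it suffered, which is the property a breakdown of contention among co-runners needs. A different split criterion would change only the `share` computation.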



Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 28, Issue 3
May 2023
456 pages
ISSN:1084-4309
EISSN:1557-7309
DOI:10.1145/3587887

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 April 2023
Online AM: 24 January 2023
Accepted: 14 January 2023
Revised: 11 January 2023
Received: 28 July 2022
Published in TODAES Volume 28, Issue 3


Author Tags

  1. Timing validation & verification
  2. contention
  3. contention breakdown
  4. NoCs

Qualifiers

  • Research-article

Funding Sources

  • Spanish Ministry of Science and Innovation
  • European Research Council (ERC)
