research-article

High Performance and Predictable Shared Last-level Cache for Safety-Critical Systems

Authors: Zhuanhao Wu, Anirudh Kaushik, Hiren Patel

ACM Transactions on Embedded Computing Systems, Volume 23, Issue 6

Article No.: 97, Pages 1 - 30

https://doi.org/10.1145/3687308

Published: 11 September 2024

Abstract

We propose ZeroCost-LLC (ZCLLC), a novel shared inclusive last-level cache (LLC) design for timing-predictable multi-core platforms that offers lower worst-case latency (WCL) than a traditional shared inclusive LLC design. ZCLLC achieves low WCL by eliminating certain memory operations, namely the cache line invalidations across the cache hierarchy that occur when a core's memory request misses in the cache hierarchy and the LLC has no vacant entry to accommodate the fetched data. In addition to low WCL, ZCLLC offers performance benefits in the form of additional caching capacity and, unlike state-of-the-art approaches, does not impose any constraints on its usage across multiple cores. In this work, we describe the impact of LLC cache line invalidations on the WCL and systematically build solutions that eliminate these invalidations, resulting in ZCLLC. We also present ZCLLC-OPT, an optimized variant of ZCLLC that offers lower WCL and improved average-case performance over ZCLLC. To obtain ZCLLC-OPT, we apply optimizations to the shared bus arbitration mechanism and extend the micro-architecture of ZCLLC to allow memory requests to the main memory to overlap. Our analysis reveals that the analytical WCL of a memory request under ZCLLC-OPT is 87.0%, 93.8%, and 97.1% lower than that under state-of-the-art LLC partition sharing techniques for 2, 4, and 8 cores, respectively. ZCLLC-OPT shows average-case performance speedups of 1.89×, 3.36×, and 6.24× over the state-of-the-art LLC partition sharing techniques for 2, 4, and 8 cores, respectively. Compared with the original ZCLLC without these optimizations (ZCLLC-NORMAL), the analytical WCLs of ZCLLC-OPT are 76.5%, 82.6%, and 86.2% lower for 2, 4, and 8 cores, respectively.
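The invalidations that the abstract refers to can be illustrated with a toy model of a conventional inclusive LLC. The sketch below is not the ZCLLC design and is not taken from the article; it only shows the baseline behavior in which an LLC eviction forces copies of the victim line out of the private caches. The class name `InclusiveLLC`, the fully associative LRU organization, and the unbounded private caches are all simplifying assumptions made for illustration.

```python
from collections import OrderedDict


class InclusiveLLC:
    """Toy fully associative inclusive LLC with LRU replacement (hypothetical model)."""

    def __init__(self, capacity, num_cores):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> None, ordered from LRU to MRU
        # Assumption: private caches are unbounded sets of addresses.
        self.private = [set() for _ in range(num_cores)]
        self.back_invalidations = 0

    def access(self, core, addr):
        """Serve a request from `core`; return the (core, addr) copies invalidated."""
        invalidated = []
        if addr in self.lines:
            self.lines.move_to_end(addr)  # LLC hit: refresh the LRU position
        else:
            if len(self.lines) >= self.capacity:
                # No vacant entry: evict the LRU line from the LLC.
                victim, _ = self.lines.popitem(last=False)
                # Inclusion requires removing the victim from every private cache.
                for c, cache in enumerate(self.private):
                    if victim in cache:
                        cache.discard(victim)
                        invalidated.append((c, victim))
            self.lines[addr] = None  # fill the LLC with the fetched line
        self.private[core].add(addr)
        self.back_invalidations += len(invalidated)
        return invalidated


if __name__ == "__main__":
    llc = InclusiveLLC(capacity=2, num_cores=2)
    llc.access(0, 0xA)
    llc.access(1, 0xB)
    # Core 1 misses while the LLC is full, so core 0's line is invalidated.
    print(llc.access(1, 0xC))
```

In this model a miss by one core can invalidate a line held by another core, which is the kind of cross-core interference a worst-case latency analysis of an inclusive LLC has to account for.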

