Demystifying Soft-Error Mitigation by Control-Flow Checking -- A New Perspective on its Effectiveness

Published: 27 September 2017

Abstract

Soft errors are a challenging and pressing problem in the domain of safety-critical embedded systems. For decades, checking schemes have been investigated and improved to mitigate soft-error effects for the class of control-flow faults, and current industrial standards strongly recommend their use.
However, reality looks different: Taking a systems perspective, we implemented four representative Control-Flow Checking (CFC) schemes and put them through their paces in 396 fault-injection campaigns. In contrast to previous work, which typically relied on probability-based vulnerability metrics, we accounted for the influence of memory and time overheads on the fault-space dimensions and applied those in full-scan fault injections. This change in procedure alone severely degraded the perceived effectiveness of CFC.
In addition, we expanded the perspective to data-flow faults and their influence on the overall susceptibility, an aspect that has so far been largely ignored. Our results suggest that, without accompanying measures, any improvement regarding control-flow faults is dominated by the increase in data faults caused by the larger attack surface in terms of memory and runtime overhead. Moreover, CFC performance depended less on the detection capabilities than on general aspects of how the concrete binary was compiled and executed.
In conclusion, incorporating CFC is not as straightforward as often assumed, and the schemes themselves may in many cases even increase the vulnerability of systems with hardened control flow.
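
The fault-space argument in the abstract can be illustrated with a minimal sketch. It assumes a uniform single-bit-flip fault model in which every (cycle, memory bit) pair is one point in the fault space. The `Variant` class and all numbers are hypothetical and chosen only for illustration; none of them come from the article's campaigns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One binary under test. All values are hypothetical."""
    name: str
    memory_bits: int      # bits of memory the binary occupies
    runtime_cycles: int   # cycles needed for one benchmark run
    failing_points: int   # fault-space points that end in a failure

    @property
    def fault_space(self) -> int:
        # Assumed model: one possible fault per (cycle, bit) pair.
        return self.memory_bits * self.runtime_cycles

    @property
    def coverage(self) -> float:
        # Fraction of fault-space points that do not end in a failure.
        return 1.0 - self.failing_points / self.fault_space


# Hypothetical baseline and a hardened variant with memory and time overhead.
baseline = Variant("baseline", memory_bits=1_000,
                   runtime_cycles=10_000, failing_points=500_000)
hardened = Variant("with CFC", memory_bits=1_500,
                   runtime_cycles=20_000, failing_points=900_000)

for v in (baseline, hardened):
    print(f"{v.name}: fault space={v.fault_space}, "
          f"coverage={v.coverage:.2%}, failures={v.failing_points}")
```

With these made-up numbers, the hardened variant shows higher coverage while its absolute failure count is also higher, because the overhead enlarges the fault space. This is one way a probability-based metric and an absolute-failure-count metric can disagree.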

Published In

ACM Transactions on Embedded Computing Systems, Volume 16, Issue 5s
Special Issue ESWEEK 2017, CASES 2017, CODES + ISSS 2017 and EMSOFT 2017
October 2017
1448 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3145508
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 September 2017
Accepted: 01 June 2017
Revised: 01 June 2017
Received: 01 March 2017
Published in TECS Volume 16, Issue 5s

Author Tags

  1. CFC
  2. CFCSS
  3. Soft error mitigation
  4. YACCA
  5. absolute-failure-count metrics
  6. control-flow checking
  7. fault-coverage
  8. fault-injection experiments
  9. reliability metrics
  10. software-based fault tolerance

Qualifiers

  • Research-article
  • Research
  • Refereed
