A Robustness-Based Confidence Measure for Hybrid System Falsification

Published: 01 May 2023

Abstract

Verification of hybrid systems is very challenging, if not impossible, because their continuous dynamics lead to an infinite state space. As a countermeasure, falsification is usually applied to show that a specification does not hold, by searching for a falsifying input, i.e., a counterexample that refutes the specification. A falsification algorithm exploits the quantitative robust semantics of temporal specifications, which assigns a numerical robustness value indicating how robustly a specification is satisfied or violated, and uses this value as a guide to explore the input space in the direction of decreasing robustness. Once a negative robustness value is observed, a falsifying input has been found. However, if a falsification algorithm does not return any falsifying input, the user cannot tell whether the specification does indeed hold or whether there exist counterexamples that the algorithm did not manage to reach. In this case, a measure of how likely it is that no counterexample exists in the input space is needed, both to better understand the safety of the system and to decide whether more budget should be allocated to falsification. To this end, we propose a confidence measure that assesses the likelihood that the system is not falsifiable, i.e., how confident a user should be that a specification holds, given that an algorithm has sampled a set of inputs but has not found any falsifying one. The confidence measure is defined in terms of a coverage criterion of the input space that assesses to what extent the whole input space has been explored and to what extent local areas where low robustness is observed have been exploited. Experiments on commonly used falsification benchmarks show that the proposed confidence measure is reasonable and can distinguish between different specifications.
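The robustness-guided search described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy first-order model in `simulate`, the single specification "always x < bound", the mixed global/local random search in `falsify`, and all parameter values. It shows only the general idea that robustness is minimized until it becomes negative, and that a search ending with positive robustness leaves open whether the specification really holds.

```python
import random
from typing import List, Tuple


def simulate(u: float, steps: int = 50, dt: float = 0.1) -> List[float]:
    """Hypothetical toy system x' = -x + u with x(0) = 0, Euler-integrated."""
    x, trace = 0.0, []
    for _ in range(steps):
        x += dt * (-x + u)
        trace.append(x)
    return trace


def robustness_always_below(trace: List[float], bound: float) -> float:
    """Robustness of 'always (x < bound)': the smallest margin bound - x(t).

    A positive value means the trace satisfies the specification with that
    margin; a negative value means the trace violates it.
    """
    return min(bound - x for x in trace)


def falsify(bound: float, lo: float, hi: float, budget: int,
            seed: int = 0) -> Tuple[float, float, bool]:
    """Search constant inputs u in [lo, hi] for one with negative robustness.

    Illustrative strategy (an assumption, not the paper's algorithm): half of
    the samples explore the whole input range, the other half perturb the
    best input found so far.
    """
    rng = random.Random(seed)
    best_u = rng.uniform(lo, hi)
    best_rob = robustness_always_below(simulate(best_u), bound)
    for _ in range(budget - 1):
        if best_rob < 0:  # negative robustness: falsifying input found
            break
        if rng.random() < 0.5:
            u = rng.uniform(lo, hi)  # global exploration
        else:
            step = rng.gauss(0.0, 0.1 * (hi - lo))
            u = min(hi, max(lo, best_u + step))  # local exploitation
        rob = robustness_always_below(simulate(u), bound)
        if rob < best_rob:
            best_u, best_rob = u, rob
    return best_u, best_rob, best_rob < 0


if __name__ == "__main__":
    for bound in (1.5, 3.0):
        u, rob, found = falsify(bound, lo=0.0, hi=2.0, budget=200)
        print(f"bound={bound}: best u={u:.3f}, robustness={rob:.3f}, "
              f"falsified={found}")
```

In this toy setting the state never exceeds the largest input value 2.0, so the specification with bound 3.0 cannot be falsified, while the one with bound 1.5 can. In a realistic setting the unfalsified case is exactly where a confidence measure is needed: the search alone reports only that no negative robustness was observed within the budget.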

Published In

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  Volume 42, Issue 5
May 2023
352 pages

Publisher

IEEE Press
