
No PAIN, No Gain? The Utility of PArallel Fault INjections

Published: 16 May 2015

Abstract

Software Fault Injection (SFI) is an established technique for assessing the robustness of software under test by exposing it to faults in its operational environment. Depending on the complexity of this operational environment, the complexity of the software under test, and the number and types of injected faults, a thorough SFI assessment can entail (a) numerous experiments and (b) long experiment run times, both of which contribute to a considerable overall test execution time.
To counteract these long execution times when dealing with complex systems, recent works propose exploiting parallel hardware to execute multiple experiments at the same time. While PArallel fault INjections (PAIN) yield higher experiment throughput, they rest on an implicit assumption of non-interference among the simultaneously executing experiments. In this paper we investigate the validity of this assumption and determine the trade-off between the increased throughput and the accuracy of the experimental results obtained from PAIN experiments.
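To make the parallelization idea concrete, here is a minimal Python sketch that dispatches independent fault injection experiments across a pool of worker processes and classifies each outcome. The command-line harness ./run_experiment, its --fault parameter, the fault identifiers, and the outcome classification are illustrative assumptions, not the paper's PAIN tooling; the sketch only shows the throughput mechanism and where the non-interference assumption enters.

    # Hypothetical sketch of PAIN-style parallel experiment dispatch.
    # Assumes each experiment runs in an isolated sandbox (e.g., a VM or
    # emulator) driven by a command-line harness called `run_experiment`;
    # the harness and its interface are illustrative, not the paper's tooling.
    import subprocess
    from concurrent.futures import ProcessPoolExecutor

    FAULTS = [f"fault-{i:04d}" for i in range(1000)]  # one injected fault per experiment
    PARALLELISM = 4  # degree of parallelism; the paper's trade-off variable

    def run_experiment(fault_id: str, timeout_s: int = 300) -> tuple[str, str]:
        """Run a single fault injection experiment and classify its outcome."""
        try:
            proc = subprocess.run(
                ["./run_experiment", "--fault", fault_id],  # hypothetical harness
                capture_output=True,
                timeout=timeout_s,
            )
            outcome = "crash" if proc.returncode != 0 else "no-failure"
        except subprocess.TimeoutExpired:
            outcome = "hang"  # no verdict before the timeout elapsed
        return fault_id, outcome

    if __name__ == "__main__":
        # Running experiments concurrently raises throughput, but it only
        # yields the same result distribution as sequential runs if the
        # experiments do not interfere (the assumption the paper puts to
        # the test).
        with ProcessPoolExecutor(max_workers=PARALLELISM) as pool:
            for fault_id, outcome in pool.map(run_experiment, FAULTS):
                print(fault_id, outcome)

Note that contention among co-running workers (CPU, memory, I/O) can inflate experiment latencies and, for instance, turn borderline runs into spurious hangs, which is exactly the kind of interference that threatens the accuracy of the collected failure-mode distribution.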


Published In

ICSE '15: Proceedings of the 37th International Conference on Software Engineering - Volume 1
May 2015, 999 pages
ISBN: 9781479919345
Publisher: IEEE Press

Conference

ICSE '15
Overall acceptance rate: 276 of 1,856 submissions (15%)
