Do Automatically Generated Test Cases Make Debugging Easier? An Experimental Assessment of Debugging Effectiveness and Efficiency

Published: 02 December 2015

Abstract

Several techniques and tools have been proposed for the automatic generation of test cases. These tools are usually evaluated in terms of their fault-revealing or coverage capability, but their impact on the manual debugging activity is not considered. The question is whether automatically generated test cases support debugging as effectively as manually written tests do.
We conducted a family of three experiments (five replications) with human participants (55 subjects in total) to assess whether the features that make automatically generated test cases less readable and understandable (e.g., unclear test scenarios, meaningless identifiers) affect the effectiveness and efficiency of debugging. The first two experiments compare different test case generation tools (Randoop vs. EvoSuite). The third experiment investigates the role of code identifiers in test cases (obfuscated vs. original identifiers), since a major difference between manual and automatically generated test cases is that the latter contain meaningless (obfuscated) identifiers.
We show that automatically generated test cases are as useful for debugging as manual test cases. Furthermore, we find that, for less experienced developers, automatic tests are more useful on average due to their lower static and dynamic complexity.
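
To make the readability gap the study investigates concrete, consider the contrast below. This is an illustrative sketch of ours, not material from the paper: a manually written JUnit 4 test for java.util.Stack next to a hand-written imitation of the style Randoop typically emits, with sequential identifiers (var0, var1, ...) and a generic method name instead of a descriptive scenario.

```java
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.util.Stack;
import org.junit.Test;

// Hypothetical illustration of the readability gap studied in the paper.
public class StackTestStyles {

    // Manually written test: descriptive name, clear scenario, focused assertion.
    @Test
    public void popReturnsMostRecentlyPushedElement() {
        Stack<String> stack = new Stack<>();
        stack.push("first");
        stack.push("second");
        assertEquals("second", stack.pop());
    }

    // Hand-written imitation of a Randoop-style generated test: meaningless
    // identifiers and no explicit scenario, the kind of naming the third
    // experiment emulates by obfuscating identifiers in manual tests.
    @Test
    public void test042() {
        Stack<String> var0 = new Stack<>();
        String var1 = var0.push("hi!"); // push returns its argument; var1 is unused, as is common in generated code
        String var2 = var0.pop();
        boolean var3 = var0.empty();
        assertEquals("hi!", var2);
        assertTrue(var3);
    }
}
```

Both tests exercise the same behavior; the question the experiments ask is whether the second style slows down or misleads a developer who must debug the failure it reveals.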

Supplementary Material

a5-ceccato-apndx.pdf (ceccato.zip)
Supplemental movie, appendix, image, and software files for "Do Automatically Generated Test Cases Make Debugging Easier? An Experimental Assessment of Debugging Effectiveness and Efficiency"

Published In

ACM Transactions on Software Engineering and Methodology, Volume 25, Issue 1
December 2015
339 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/2852270

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 December 2015
Accepted: 01 April 2015
Revised: 01 November 2014
Received: 01 May 2014
Published in TOSEM Volume 25, Issue 1

Author Tags

  1. Empirical software engineering
  2. automatic test case generation
  3. debugging

Qualifiers

  • Research-article
  • Research
  • Refereed
