DOI: 10.1145/3624032.3624038
research-article

Test Data Selection Based on Applying Mutation Testing to Decision Tree Models

Published: 17 October 2023

Abstract

Software testing is crucial to ensuring software quality: it verifies that the software behaves as expected and helps identify defects from the early stages of the development process. Testing is especially important in complex or critical systems, such as those using Machine Learning (ML) techniques, since the models can present uncertainties and errors that affect their reliability. This work investigates the use of mutation testing to support the validation of ML applications. Our approach applies mutation analysis to the decision tree structure. The resulting mutated trees serve as a reference for selecting a test dataset that can effectively identify incorrect classifications in ML models. Preliminary results suggest that the proposed approach can improve test data selection for ML applications.
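
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tree representation, the single mutation operator (shifting a split threshold by a fixed `delta`), the greedy selection strategy, and all names (`Leaf`, `Split`, `threshold_mutants`, `select_killing_inputs`) are assumptions made for illustration only. A candidate input is kept when the original tree and at least one still-alive mutant classify it differently, that is, when it "kills" a mutant.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Set, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    label: str


@dataclass(frozen=True)
class Split:
    feature: int        # index of the feature tested at this node
    threshold: float    # go left when x[feature] <= threshold
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def predict(node: Node, x: Sequence[float]) -> str:
    """Walk the tree from the root to a leaf and return its label."""
    while isinstance(node, Split):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def threshold_mutants(node: Node, delta: float) -> List[Node]:
    """Return mutated trees, each differing from the original in exactly one
    split threshold (shifted by -delta or +delta). Hypothetical operator."""
    if isinstance(node, Leaf):
        return []
    mutants: List[Node] = [
        replace(node, threshold=node.threshold + d) for d in (-delta, delta)
    ]
    mutants += [replace(node, left=m) for m in threshold_mutants(node.left, delta)]
    mutants += [replace(node, right=m) for m in threshold_mutants(node.right, delta)]
    return mutants


def select_killing_inputs(
    tree: Node, mutants: List[Node], candidates: List[Tuple[float, ...]]
) -> Tuple[List[Tuple[float, ...]], Set[int]]:
    """Greedily keep each candidate that kills at least one still-alive mutant."""
    alive = set(range(len(mutants)))
    selected = []
    for x in candidates:
        expected = predict(tree, x)
        killed = {i for i in alive if predict(mutants[i], x) != expected}
        if killed:
            selected.append(x)
            alive -= killed
    return selected, alive


# Toy tree and candidate pool (made-up values, for illustration only).
tree = Split(0, 2.5, Leaf("A"), Split(1, 1.7, Leaf("B"), Leaf("C")))
mutants = threshold_mutants(tree, delta=0.2)
candidates = [(1.0, 0.5), (2.4, 1.0), (2.6, 1.0), (5.0, 1.6), (5.0, 1.8), (6.0, 3.0)]

selected, alive = select_killing_inputs(tree, mutants, candidates)
mutation_score = (len(mutants) - len(alive)) / len(mutants)
print(selected, mutation_score)
```

In this toy setting, the inputs that survive selection are the ones lying close to a decision boundary, since only those are classified differently by a tree whose threshold has been shifted slightly. Inputs far from every boundary, such as `(1.0, 0.5)`, kill no mutant and are discarded.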

Published In

SAST '23: Proceedings of the 8th Brazilian Symposium on Systematic and Automated Software Testing
September 2023
133 pages
ISBN:9798400716294
DOI:10.1145/3624032
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Decision Tree
  2. Machine Learning
  3. Mutation Testing
  4. Software Testing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Conselho Nacional de Desenvolvimento Científico e Tecnológico
  • Fundação de Amparo à Pesquisa do Estado de São Paulo

Conference

SAST 2023

Acceptance Rates

Overall Acceptance Rate 45 of 92 submissions, 49%
