DOI: 10.1145/3644032.3644467
research-article

Machine Learning-based Test Case Prioritization using Hyperparameter Optimization

Published: 10 June 2024

Abstract

Continuous integration pipelines execute extensive automated test suites to validate new software builds. In this fast-paced development environment, delivering timely test results to developers is critical to ensuring software quality. Test case prioritization (TCP) addresses this need by ordering test cases so that fault-prone ones run first. Recent advances in machine learning (ML) have shown promising results in TCP. Hyperparameter tuning plays a crucial role in the performance of ML models; however, little work has investigated its effects on TCP. We therefore explore how optimized hyperparameters influence the performance of various ML classifiers, focusing on the Average Percentage of Faults Detected (APFD) metric. Through empirical analysis of ten real-world, large-scale, diverse datasets, we conduct grid search-based tuning with 885 hyperparameter combinations for four ML models. Our results provide model-specific insights and demonstrate an average 15% improvement in model performance with hyperparameter tuning compared to default settings. We further explain how hyperparameter tuning improves precision (max = 1), recall (max = 0.9633), and F1-score (max = 0.9662), and how it influences the APFD value (max = 0.9835), indicating a direct connection between tuning and prioritization performance. This study thus underscores the importance of hyperparameter tuning in optimizing failure prediction models and its direct impact on prioritization performance.
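As a rough illustration of the APFD metric named in the abstract, the sketch below uses the commonly cited definition, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test that reveals fault i. The abstract does not state which variant the authors compute, so this is an assumption; the function name `apfd`, the test IDs, and the fault mapping are hypothetical.

```python
from typing import Dict, List, Set


def apfd(order: List[str], faults: Dict[str, Set[str]]) -> float:
    """Average Percentage of Faults Detected for one prioritized order.

    order  -- test IDs in the order they would run (each exactly once)
    faults -- fault ID -> set of test IDs that reveal that fault
    (Both inputs are illustrative; they are not taken from the paper.)
    """
    n = len(order)
    m = len(faults)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test and one fault")

    # 1-based position of every test in the prioritized order.
    position = {test: i for i, test in enumerate(order, start=1)}

    # TF_i: position of the first test that reveals fault i.
    first_hits = []
    for fault, revealing in faults.items():
        hits = [position[t] for t in revealing if t in position]
        if not hits:
            raise ValueError(f"fault {fault!r} is not revealed by any test")
        first_hits.append(min(hits))

    return 1.0 - sum(first_hits) / (n * m) + 1.0 / (2 * n)


# Hypothetical example: five tests, two faults.
faults = {"f1": {"t3"}, "f2": {"t1", "t5"}}
good_order = ["t3", "t1", "t2", "t4", "t5"]  # fault-revealing tests first
poor_order = ["t2", "t4", "t5", "t1", "t3"]  # fault-revealing tests last

good_score = apfd(good_order, faults)
poor_score = apfd(poor_order, faults)
print(round(good_score, 4), round(poor_score, 4))
```

An ordering that runs fault-revealing tests earlier scores closer to 1, which is why a classifier that ranks failing tests more accurately (for example, after tuning) would be expected to raise APFD.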

Cited By

  • (2024) An End-to-End Test Case Prioritization Framework using Optimized Machine Learning Models. 2024 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 1-8. DOI: 10.1109/ICSTW60967.2024.00014. Online publication date: 27-May-2024
  • (2024) On the Effectiveness of Feature Selection Techniques in the Context of ML-Based Regression Test Prioritization. IEEE Access, vol. 12, pp. 131556-131575. DOI: 10.1109/ACCESS.2024.3459656. Online publication date: 2024

Published In

AST '24: Proceedings of the 5th ACM/IEEE International Conference on Automation of Software Test (AST 2024)
April 2024
235 pages
ISBN:9798400705885
DOI:10.1145/3644032
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hyperparameter optimization
  2. test case prioritization
  3. machine learning
  4. continuous integration

Qualifiers

  • Research-article

Conference

AST '24
Article Metrics

  • Downloads (last 12 months): 89
  • Downloads (last 6 weeks): 16
Reflects downloads up to 31 Dec 2024