research-article

Usage of multiple prediction models based on defect categories

Authors:

Andriy Miranskyy,

Nuzio Ruffolo

PROMISE '10: Proceedings of the 6th International Conference on Predictive Models in Software Engineering

Article No.: 8, Pages 1 - 9

https://doi.org/10.1145/1868328.1868341

Published: 12 September 2010

Abstract

Background: Most of the defect prediction models are built for two purposes: 1) to detect defective and defect-free modules (binary classification), and 2) to estimate the number of defects (regression analysis). It would also be useful to give more information on the nature of defects so that software managers can plan their testing resources more effectively.

Aims: In this paper, we propose a defect prediction model that is based on defect categories.

Method: We mined the version history of a large-scale enterprise software product to extract churn and static code metrics, and grouped the defects into three categories according to different testing phases. We built a learning-based model for each defect category. We compared the performance of our proposed model with a general one. We applied statistical techniques to evaluate the relationship between defect categories and software metrics. We also tested our hypothesis by replicating the empirical work on Eclipse data.
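The categorization step described above can be pictured as turning one defect log into one binary label set per category, so that a separate learner can be trained on each, alongside a single "general" label set for comparison. The sketch below is a minimal illustration under stated assumptions: the category names, the record layout, and the helpers `labels_by_category` and `general_labels` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple

# Hypothetical category names. The abstract only states that there are
# three categories defined by testing phase; it does not name them here.
CATEGORIES = ("phase_a", "phase_b", "phase_c")


def labels_by_category(
    modules: Sequence[str],
    defects: Iterable[Tuple[str, str]],
) -> Dict[str, Dict[str, int]]:
    """Build one binary label per module for each defect category.

    `defects` is an iterable of (module, category) records.
    A label of 1 means the module had at least one defect in that category.
    """
    found = defaultdict(set)
    for module, category in defects:
        found[category].add(module)
    return {c: {m: int(m in found[c]) for m in modules} for c in CATEGORIES}


def general_labels(
    modules: Sequence[str],
    defects: Iterable[Tuple[str, str]],
) -> Dict[str, int]:
    """Build the labels for a single general model: defective in any category."""
    defective = {module for module, _ in defects}
    return {m: int(m in defective) for m in modules}


# Illustrative data only (not from the study).
modules = ["a.c", "b.c", "c.c"]
defects = [("a.c", "phase_a"), ("a.c", "phase_b"), ("c.c", "phase_b")]

per_category = labels_by_category(modules, defects)
overall = general_labels(modules, defects)
print(per_category["phase_b"])
print(overall)
```

Each entry of `per_category` would serve as the target for its own classifier, while `overall` would be the target for the general model that the per-category models are compared against.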

Results: Our results show that building models that are sensitive to defect categories is cost-effective in the sense that it reveals more information and increases detection rates (pd) by 10% while keeping false alarm rates (pf) constant.
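For readers unfamiliar with the two rates quoted above, the sketch below computes them from a confusion matrix, assuming the definitions commonly used in the defect prediction literature: pd = TP / (TP + FN) and pf = FP / (FP + TN). The function name `pd_pf` and the sample labels are illustrative assumptions, not data or code from the study.

```python
from typing import Sequence, Tuple


def pd_pf(actual: Sequence[int], predicted: Sequence[int]) -> Tuple[float, float]:
    """Return (pd, pf) for binary labels where 1 = defective, 0 = defect-free."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # defective, flagged
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # defective, missed
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # clean, flagged
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # clean, not flagged

    # Guard against division by zero when a class is absent.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical labels for eight modules (not from the paper).
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]
rates = pd_pf(actual, predicted)
print(rates)  # (0.75, 0.25)
```

A model comparison in the spirit of the abstract would hold the second value fixed and look at the change in the first.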

Conclusions: We conclude that slicing defect data and categorizing it for use in a defect prediction model would enable practitioners to take immediate actions. Our results on Eclipse replication showed that haphazard categorization of defects is not worth the effort.

Published In

PROMISE '10: Proceedings of the 6th International Conference on Predictive Models in Software Engineering

September 2010

195 pages

ISBN: 9781450304047

DOI: 10.1145/1868328

General Chair: Tim Menzies, West Virginia University

Program Chair: Gunes Koru, University of Maryland Baltimore County

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

PROMISE '10: The 6th International Conference on Predictive Models in Software Engineering

September 12 - 13, 2010

Timişoara, Romania

Acceptance Rates

PROMISE '10 Paper Acceptance Rate 19 of 53 submissions, 36%;

Overall Acceptance Rate 98 of 213 submissions, 46%
