DOI: 10.5555/2819009.2819025

Merits of organizational metrics in defect prediction: an industrial replication

Published: 16 May 2015

Abstract

Defect prediction models presented in the literature lack generalization unless the original study can be replicated with new datasets and in different organizational settings. Practitioners can also benefit from replicating studies in their own environments, gaining insights and comparing their findings with those reported. In this work, we replicated an earlier study to investigate the merits of organizational metrics in building defect prediction models for large-scale enterprise software. We mined the organizational, code complexity, code churn, and pre-release bug metrics of that large-scale software and built a defect prediction model for each metric set. In the original study, organizational metrics achieved the highest performance. In our case, models based on organizational metrics outperformed models based on churn metrics, but were themselves outperformed by models based on pre-release bug metrics. Further, we verified four individual organizational metrics as indicators of defects. We conclude that the performance of different metric sets in building defect prediction models depends on the project's characteristics and the targeted prediction level. Our replication of earlier research enabled us to assess the validity and limitations of organizational metrics in a different context.
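To make the study design concrete, the sketch below shows the general shape of comparing metric sets for defect prediction: one simple predictor is built per metric set, and each is scored against the known defect labels. All module records, metric names (`org_owners`, `churn`, `pre_release_bugs`), and thresholds here are invented for illustration; the paper's actual datasets and models are not reproduced.

```python
# Hypothetical sketch of comparing metric sets for defect prediction.
# Data and thresholds are invented; only the overall workflow mirrors
# the study design (one predictor per metric set, evaluated per set).

# Each module record: one feature per metric set, plus a defect label.
modules = [
    {"org_owners": 5, "churn": 120, "pre_release_bugs": 3, "defective": True},
    {"org_owners": 1, "churn": 15,  "pre_release_bugs": 0, "defective": False},
    {"org_owners": 4, "churn": 200, "pre_release_bugs": 2, "defective": True},
    {"org_owners": 2, "churn": 30,  "pre_release_bugs": 1, "defective": False},
    {"org_owners": 6, "churn": 70,  "pre_release_bugs": 4, "defective": True},
    {"org_owners": 2, "churn": 150, "pre_release_bugs": 0, "defective": False},
]

def evaluate(metric, threshold):
    """Flag a module as defect-prone when metric > threshold, then
    return (precision, recall) against the defect labels."""
    tp = sum(1 for m in modules if m[metric] > threshold and m["defective"])
    fp = sum(1 for m in modules if m[metric] > threshold and not m["defective"])
    fn = sum(1 for m in modules if m[metric] <= threshold and m["defective"])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# One single-metric "model" per metric set, scored on the same modules.
for metric, threshold in [("org_owners", 3), ("churn", 80), ("pre_release_bugs", 1)]:
    p, r = evaluate(metric, threshold)
    print(f"{metric}: precision={p:.2f} recall={r:.2f}")
```

A real replication would use proper classifiers (e.g. logistic regression) and cross-validation, but the comparison logic is the same: each metric set is evaluated in isolation so its predictive merit can be ranked against the others.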



Published In

ICSE '15: Proceedings of the 37th International Conference on Software Engineering - Volume 2
May 2015
1058 pages

Publisher

IEEE Press



