DOI: 10.5555/2819009.2819025

Merits of organizational metrics in defect prediction: an industrial replication

Published: 16 May 2015

Abstract

Defect prediction models presented in the literature lack generalization unless the original study can be replicated with new datasets and in different organizational settings. Practitioners can also benefit from replicating studies in their own environments, gaining insights and comparing their findings with those reported. In this work, we replicated an earlier study to investigate the merits of organizational metrics in building defect prediction models for large-scale enterprise software. We mined the organizational, code complexity, code churn, and pre-release bug metrics of that large-scale software and built a defect prediction model for each metric set. In the original study, organizational metrics achieved the highest performance. In our case, models based on organizational metrics outperformed models based on churn metrics, but were themselves outperformed by models based on pre-release bug metrics. Further, we verified four individual organizational metrics as indicators of defects. We conclude that the performance of different metric sets in building defect prediction models depends on the project's characteristics and the targeted prediction level. Our replication of earlier research enabled us to assess the validity and limitations of organizational metrics in a different context.
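To make the study design concrete, the sketch below shows the general shape of comparing metric sets for defect prediction: one simple predictor is built per metric set, and each is scored against the known defect labels. All module records, metric names (`org_owners`, `churn`, `pre_release_bugs`), and thresholds here are invented for illustration; the paper's actual datasets and models are not reproduced.

```python
# Hypothetical sketch of comparing metric sets for defect prediction.
# Data and thresholds are invented; only the overall workflow mirrors
# the study design (one predictor per metric set, evaluated per set).

# Each module record: one feature per metric set, plus a defect label.
modules = [
    {"org_owners": 5, "churn": 120, "pre_release_bugs": 3, "defective": True},
    {"org_owners": 1, "churn": 15,  "pre_release_bugs": 0, "defective": False},
    {"org_owners": 4, "churn": 200, "pre_release_bugs": 2, "defective": True},
    {"org_owners": 2, "churn": 30,  "pre_release_bugs": 1, "defective": False},
    {"org_owners": 6, "churn": 70,  "pre_release_bugs": 4, "defective": True},
    {"org_owners": 2, "churn": 150, "pre_release_bugs": 0, "defective": False},
]

def evaluate(metric, threshold):
    """Flag a module as defect-prone when metric > threshold, then
    return (precision, recall) against the defect labels."""
    tp = sum(1 for m in modules if m[metric] > threshold and m["defective"])
    fp = sum(1 for m in modules if m[metric] > threshold and not m["defective"])
    fn = sum(1 for m in modules if m[metric] <= threshold and m["defective"])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# One single-metric "model" per metric set, scored on the same modules.
for metric, threshold in [("org_owners", 3), ("churn", 80), ("pre_release_bugs", 1)]:
    p, r = evaluate(metric, threshold)
    print(f"{metric}: precision={p:.2f} recall={r:.2f}")
```

A real replication would use proper classifiers (e.g. logistic regression) and cross-validation, but the comparison logic is the same: each metric set is evaluated in isolation so its predictive merit can be ranked against the others.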



Published In

ICSE '15: Proceedings of the 37th International Conference on Software Engineering - Volume 2
May 2015
1058 pages

Publisher

IEEE Press



