Research article, DOI: 10.1145/1987875.1987888

Defect prediction using social network analysis on issue repositories

Published: 21 May 2011

Abstract

People are the most important pillar of the software development process. It is critical to understand how they interact with each other and how these interactions affect the quality of the end product in terms of defects. In this research we propose a new set of metrics for predicting defects: social network metrics computed on issue repositories. Social network metrics on issue repositories have not been used before to predict the defect proneness of a software product. To validate our hypotheses we conducted experiments on two datasets: development data from IBM Rational® Team Concert™ (RTC) and from Drupal. The results show that, compared with other sets of metrics such as churn metrics, using social network metrics on issue repositories either considerably decreases high false alarm rates without compromising detection rates, or considerably increases low prediction rates without compromising low false alarm rates. We therefore recommend that practitioners collect social network metrics on issue repositories, since people-related information is a strong indicator of past patterns in a given team.
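To make the idea of "social network metrics on issue repositories" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that two people are linked whenever they take part in the same issue, and it uses normalized degree centrality as the only metric; the abstract does not state which metrics or aggregation the paper uses. The issue data, the function names, and the per-issue mean aggregation are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical issue data: issue id -> people who commented on or were assigned to it.
issues = {
    "ISSUE-1": {"alice", "bob"},
    "ISSUE-2": {"alice", "carol", "dave"},
    "ISSUE-3": {"bob", "carol"},
}


def build_network(issue_participants):
    """Build an undirected graph linking people who took part in the same issue."""
    neighbors = defaultdict(set)
    for people in issue_participants.values():
        for person in people:
            neighbors[person]  # register people who never share an issue with anyone
        for a, b in combinations(sorted(people), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)


def degree_centrality(neighbors):
    """Normalized degree centrality: number of neighbors divided by (n - 1)."""
    n = len(neighbors)
    if n < 2:
        return {person: 0.0 for person in neighbors}
    return {person: len(adj) / (n - 1) for person, adj in neighbors.items()}


def mean_centrality_per_issue(issue_participants, centrality):
    """Aggregate person-level centrality into one feature value per issue (assumed aggregation)."""
    return {
        issue: sum(centrality[p] for p in people) / len(people)
        for issue, people in issue_participants.items()
        if people
    }


network = build_network(issues)
centrality = degree_centrality(network)
features = mean_centrality_per_issue(issues, centrality)

for issue in sorted(features):
    print(issue, round(features[issue], 3))
```

Per-issue values like these could then be fed to a classifier alongside, or instead of, churn metrics; which aggregation works best is an empirical question that this sketch does not answer.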



    Published In

    ICSSP '11: Proceedings of the 2011 International Conference on Software and Systems Process
    May 2011
    256 pages
    ISBN:9781450307307
    DOI:10.1145/1987875

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. defect prediction
    2. developer communication
    3. network metrics
    4. social networks


    Cited By

    • (2023) Organizational Influencers in Open-Source Software Projects. International Journal of Open Source Software and Processes, 14:1 (1-20). DOI: 10.4018/IJOSSP.318400. Online publication date: 16-Feb-2023
    • (2023) Tell Me Who Are You Talking to and I Will Tell You What Issues Need Your Skills. 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR) (611-623). DOI: 10.1109/MSR59073.2023.00087. Online publication date: May-2023
    • (2023) A Novel Approach to Improve Software Defect Prediction Accuracy Using Machine Learning. IEEE Access, 11 (63579-63597). DOI: 10.1109/ACCESS.2023.3287326. Online publication date: 2023
    • (2022) Improving Defect Prediction Using Combination of Software Metrics. 2022 International Conference on Data and Software Engineering (ICoDSE) (89-94). DOI: 10.1109/ICoDSE56892.2022.9971813. Online publication date: 2-Nov-2022
    • (2020) Predicting the bug fixing time using word embedding and deep long short term memories. IET Software, 14:3 (203-212). DOI: 10.1049/iet-sen.2019.0260. Online publication date: Jun-2020
    • (2017) Using contextual information to predict co-changes. Journal of Systems and Software, 128:C (220-235). DOI: 10.1016/j.jss.2016.07.016. Online publication date: 1-Jun-2017
    • (2017) Mining Rational Team Concert Repositories: A Case Study on a Software Project. Progress in Artificial Intelligence (537-548). DOI: 10.1007/978-3-319-65340-2_44. Online publication date: 9-Aug-2017
    • (2015) The Art and Science of Analyzing Software Data. Online publication date: 15-Sep-2015
    • (2014) Effect of temporal collaboration network, maintenance activity, and experience on defect exposure. Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (1-8). DOI: 10.1145/2652524.2652586. Online publication date: 18-Sep-2014
    • (2014) Social metrics included in prediction models on software engineering. Proceedings of the 10th International Conference on Predictive Models in Software Engineering (72-81). DOI: 10.1145/2639490.2639505. Online publication date: 17-Sep-2014