Research article. DOI: 10.1145/3084226.3084250

Towards Confidence with Capture-recapture Estimation: An Exploratory Study of Dependence within Inspections

Published: 15 June 2017

Abstract

Background: Capture-ReCapture (CRC), a technique for post-inspection defect estimation, has been studied in the Software Engineering (SE) community since the 1990s. While most studies have focused on evaluating the performance of various CRC models and estimators, few have assessed the credibility of the estimation results, which makes it difficult to base quality-management decisions on CRC defect estimates. Objective: This research aims to explore a reliable and practical approach to assessing the credibility of CRC-based defect estimation. Method: One fundamental assumption of the CRC method is the statistical independence of samples, which can be measured by the 'Coefficient of CoVariation' (CCV). We applied CCV as an indicator of the statistical dependence between the observations (i.e., the defects detected by inspectors), and assessed the CRC estimation results on datasets published in the SE literature by examining the correlation between Relative Error (RE) and CCV. Based on the observed correlation, we further propose CĈV, which replaces the unknown N (the actual number of defects) with the estimated number (N̂), to assess the credibility of CRC estimates. Results: We found that most datasets have non-zero CCVs and that the R² (coefficient of determination) of non-linear curve fitting between their CCVs and REs is higher than 0.8. Conclusions: Our study provides evidence that statistical dependence among inspectors is ubiquitous in existing CRC-related studies. Moreover, the significant correlation between CCV (CĈV in practice) and RE may make it possible to assess CRC-based estimates in support of quality management.
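To make the quantities in the Method concrete, the following is a minimal sketch for the two-inspector case. It rests on assumptions that the abstract does not state: the Lincoln-Petersen estimator stands in for "a CRC estimator", and CCV is computed in the sample-based form common in the capture-recapture literature, N · m12 / (n1 · n2) − 1, where n1 and n2 are the defect counts of each inspector and m12 is their overlap. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
def lincoln_petersen(n1: int, n2: int, m12: int) -> float:
    """Two-sample CRC estimate of the total number of defects.

    Assumes the two inspectors detect defects independently.
    """
    if m12 == 0:
        raise ValueError("no overlapping defects; estimate is undefined")
    return n1 * n2 / m12


def ccv(n1: int, n2: int, m12: int, n_total: float) -> float:
    """Sample-based coefficient of covariation for two inspectors.

    Assumed form: N * m12 / (n1 * n2) - 1. A value of zero is consistent
    with independence; a positive value means the inspectors tend to
    find the same defects.
    """
    return n_total * m12 / (n1 * n2) - 1.0


def relative_error(n_hat: float, n_true: float) -> float:
    """Relative error of an estimate against the true defect count."""
    return (n_hat - n_true) / n_true


# Hypothetical inspection: 12 true defects, identified by integer ids.
inspector_a = {1, 2, 3, 4, 5, 6}
inspector_b = {4, 5, 6, 7, 8}
n_true = 12

n1, n2 = len(inspector_a), len(inspector_b)
m12 = len(inspector_a & inspector_b)  # defects found by both inspectors

n_hat = lincoln_petersen(n1, n2, m12)
print(f"estimate = {n_hat:.2f}")
print(f"CCV      = {ccv(n1, n2, m12, n_true):.3f}")
print(f"RE       = {relative_error(n_hat, n_true):.3f}")
```

Under these assumptions the two quantities are tied together algebraically: RE = −CCV / (1 + CCV), so positive dependence between inspectors produces an underestimate, and the relationship is non-linear. Note also that substituting the Lincoln-Petersen estimate itself for N in this CCV formula yields zero by construction, so a plug-in version would need an estimator that does not assume independence; which estimator the paper uses for CĈV is not stated in the abstract.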


Published In

EASE '17: Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering
June 2017
405 pages
ISBN:9781450348041
DOI:10.1145/3084226

In-Cooperation

  • School of Computing, BTH: Blekinge Institute of Technology - School of Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. capture-recapture
  2. coefficient of covariation
  3. defect estimation
  4. statistical independence

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EASE'17

Acceptance Rates

Overall Acceptance Rate 71 of 232 submissions, 31%
