select the appropriate code coverage tool. However, they did not conduct an actual
experiment to examine the effectiveness of these tools.
Moreover, Shahid and Ibrahim [5] surveyed 19 code coverage tools. They compared,
theoretically, five features: programming language support, instrumentation level (source
code or byte code), coverage metrics (statement, branch, method, and class), GUI support,
and reporting formats. This information was collected from the literature and the tools'
websites, but they did not conduct an experiment to examine the variance in the coverage
metric values reported by these tools.
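To make this variance concrete, consider the following minimal Java sketch (a hypothetical
example, not drawn from [5]). A single test that calls max(5, 3) executes every statement of
the method, so a tool reporting statement coverage would show 100%; however, only the true
branch of the if decision is exercised, so a tool reporting branch coverage would show 50%
for the very same run.

    // Hypothetical example: one test run, two different coverage values.
    public class MathUtil {

        // Method under test.
        public static int max(int a, int b) {
            int result = b;      // statement 1
            if (a > b) {         // decision with a true and a false branch
                result = a;      // statement 2, reached via the true branch
            }
            return result;       // statement 3
        }

        public static void main(String[] args) {
            // This single "test" executes statements 1-3 (100% statement
            // coverage) but exercises only the true branch of the decision
            // (50% branch coverage).
            assert max(5, 3) == 5;
        }
    }

This kind of divergence between metrics is exactly why an experimental comparison of the
values reported by different tools is needed.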
In fact, some studies performed experiments that focused on large software systems. On
the one hand, code coverage tools provide developers with a huge amount of coverage data
for identifying the tested areas; on the other hand, analyzing this data is a time-consuming
task. Therefore, Asaf et al. [36] proposed an approach for defining multiple views of
coverage data to improve coverage analysis. Furthermore, Kessis et al. [3] presented a test
and coverage analysis of J2EE servers. Their main aim was to provide a real case study
consisting of a test and coverage analysis of the JOnAS server. In this experiment, they used
the JOnAS middleware (200,000 LOC), more than 2,500 test cases, and the Clover analyzer.
They presented empirical evidence of the applicability of coverage analysis to large Java
applications.
In addition, Kim [6] empirically investigated efficient ways to perform code coverage
analysis on large software projects. He examined coarse coverage analysis versus detailed
coverage analysis, block coverage versus decision coverage, and cyclomatic complexity
versus defect and module size. The study used a large software system with 19,800K LOC.
Based on his findings, he proposed a systematic approach to coverage analysis for large
software systems. Finally, Elbaum et al. [39] empirically examined the impact of software
evolution on coverage information, using the statement coverage and function coverage
metrics in their experiment. They found that changes made during software evolution affect
the quality of coverage information. However, they did not study the variance in coverage
criteria across code coverage tools.
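The difference in granularity between the metrics compared in these studies can be
illustrated with a small Java sketch (illustrative only; it is not code from [6] or [39]). A
coarse metric such as function (method) coverage counts a method as covered as soon as it is
entered, whereas statement coverage tracks each executable statement, so a partially
executed method inflates the coarser value:

    // Illustrative sketch: coarse metrics can report higher values
    // than fine-grained ones for the same test run.
    public class Classifier {

        public static String classify(int value) {
            if (value < 0) {
                return "negative";   // executed by the test below
            }
            if (value == 0) {
                return "zero";       // never executed
            }
            return "positive";       // never executed
        }

        public static void main(String[] args) {
            // One call enters the method, so function coverage is 1/1 (100%),
            // but only 2 of the 5 executable statements run, so statement
            // coverage is roughly 40% (exact counts vary by tool).
            assert classify(-1).equals("negative");
        }
    }

Such differences are one reason why the choice between coarse and detailed coverage
analysis matters for large systems, as examined by Kim [6].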
2.2. Metrics for Evaluating Code Coverage Tools
To evaluate code coverage tools in a quantitative and qualitative way, some studies have
presented sets of metrics for evaluating them. These metrics allow researchers and
practitioners to understand the features of code coverage tools and may help them choose an
appropriate tool from a set of candidates.
Therefore, Michael et al. [7] proposed a suite of metrics for evaluating tool features.
These metrics assist researchers and practitioners in choosing an appropriate code coverage
tool. The suite consists of 13 metrics, such as Human Interface Design (HID) and Maturity
and Customer Base (MCB). However, the proposed metrics evaluate the features of code
coverage tools without considering the variance in coverage metric values. Moreover, Priya
et al. [8] conducted an experiment that applied the suite of metrics proposed in [7] to the
testing of procedural software. In this experiment, the researchers used nine small programs
and four code coverage tools to calculate the proposed metrics, but they did not focus on the
variance in coverage metric values.
Furthermore, Kajo-Mece and Tartari [9] conducted an experiment that examined two code
coverage tools, Emma and Clover, using very simple Java programs implementing search
and sort algorithms. They calculated four of the metrics proposed in [7] to judge which code
coverage tool could be used efficiently by a testing team: Reporting Features (RF), Ease of
Use (EU), Response Time (RT), and Human Interface Design (HID).