Authors: Tome Eftimov¹; Peter Korošec² and Barbara Koroušić Seljak³
Affiliations:
¹ Jožef Stefan Institute and Jožef Stefan International Postgraduate School, Slovenia
² Jožef Stefan Institute, Faculty of Mathematics and Natural Science and Information Technologies, Slovenia
³ Jožef Stefan Institute, Slovenia
Keyword(s):
Statistical Comparison, Single Objective Functions, Deep Statistics, Stochastic Optimization Algorithms.
Abstract:
Deep Statistical Comparison (DSC) is a recently proposed approach for the statistical comparison of meta-heuristic stochastic algorithms for single-objective optimization. Its main contribution is a ranking scheme based on the whole distribution of results, rather than on a single statistic such as the commonly used average or median. Unlike the common approach, DSC gives more robust statistical results that are not affected by outliers or by a misleading ranking scheme. The DSC ranking scheme uses a statistical test that compares distributions in order to rank the algorithms. DSC was originally tested with the two-sample Kolmogorov-Smirnov (KS) test; however, distributions can be compared using different criteria, i.e., different statistical tests. In this paper, we analyze the behavior of DSC under two such criteria: the two-sample KS test and the Anderson-Darling (AD) test. Experimental results on benchmark suites of single-objective problems show that both criteria behave similarly. However, when algorithms are compared on a single problem, the AD test is preferable because it is more powerful and better detects differences than the KS test when the distributions differ only in shift, only in scale, only in symmetry, or share the same mean and standard deviation but differ only in the tails. This effect is not pronounced when the approach is used for multiple-problem analysis.
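To make the ranking criterion concrete, the Python sketch below illustrates the core building block of a DSC-style comparison: pairwise testing whether the result distributions of several algorithms on one problem differ, using either the two-sample KS test or the k-sample AD test from SciPy. This is a minimal sketch, not the authors' DSC implementation; the function name, the significance level, and the synthetic data are illustrative assumptions.

import numpy as np
from scipy import stats

def pairwise_same_distribution(samples, test="ks", alpha=0.05):
    """Boolean matrix: True where the chosen test cannot distinguish
    the two result distributions at significance level alpha."""
    k = len(samples)
    same = np.eye(k, dtype=bool)
    for i in range(k):
        for j in range(i + 1, k):
            if test == "ks":
                p = stats.ks_2samp(samples[i], samples[j]).pvalue
            else:
                # k-sample AD test; note SciPy clips its p-value to [0.001, 0.25]
                p = stats.anderson_ksamp([samples[i], samples[j]]).significance_level
            same[i, j] = same[j, i] = p > alpha
    return same

# Illustrative data: three "algorithms", 30 runs each; the third is shifted.
rng = np.random.default_rng(0)
runs = [rng.normal(loc, 1.0, 30) for loc in (0.0, 0.0, 0.5)]
print(pairwise_same_distribution(runs, test="ks"))
print(pairwise_same_distribution(runs, test="ad"))

In a full DSC-style ranking, algorithms found statistically indistinguishable by the chosen test would then share an averaged rank, while distinguishable ones would be ranked apart.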