Consistent Hierarchical Classification with A Generalized Metric
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4825-4833, 2024.
Abstract
In multi-class hierarchical classification, a natural evaluation metric is the tree distance loss, which takes the value of two labels’ distance on the pre-defined tree hierarchy. This metric is motivated by the fact that its Bayes optimal solution is the deepest label on the tree whose induced superclass (the subtree rooted at it) includes the true label with probability at least $\frac{1}{2}$. However, it can hardly accommodate the differing risk sensitivities of different tasks, since its accuracy requirement for induced superclasses is fixed at $\frac{1}{2}$. In this paper, we first introduce a new evaluation metric that generalizes the tree distance loss, in which the solution’s accuracy constraint $\frac{1+c}{2}$ is controlled by a penalty value $c$ tailored to each task: a higher $c$ emphasizes the prediction’s accuracy, while a lower one emphasizes its specificity. We then propose a novel class of consistent surrogate losses based on an intuitive presentation of our generalized metric and its regret, which can be combined with various binary losses. Finally, we theoretically derive regret transfer bounds for the proposed surrogates and empirically validate their usefulness on benchmark datasets.
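The Bayes optimal rule described above admits a simple procedural reading: among all nodes whose induced subtree carries probability mass at least $\frac{1+c}{2}$, predict the deepest one. Below is a minimal Python sketch of this rule, not the paper's implementation; the tree encoding via a `children` dict and the helper names are assumptions made for illustration.

```python
# A minimal sketch (assumed encoding, not the paper's code) of the Bayes
# optimal rule: given leaf-label probabilities and a tree hierarchy, predict
# the deepest node whose induced subtree has probability mass >= (1+c)/2.

def subtree_mass(node, children, leaf_prob):
    """Total probability of all leaf labels in the subtree rooted at `node`."""
    if node not in children:  # leaf node
        return leaf_prob.get(node, 0.0)
    return sum(subtree_mass(ch, children, leaf_prob) for ch in children[node])

def bayes_optimal(root, children, leaf_prob, c=0.0):
    """Deepest node whose subtree mass meets the accuracy threshold (1+c)/2.

    c = 0 recovers the tree distance loss; a larger c favors accuracy
    (shallower, safer predictions), a smaller c favors specificity.
    """
    threshold = (1.0 + c) / 2.0
    best, best_depth = root, 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if subtree_mass(node, children, leaf_prob) >= threshold:
            if depth > best_depth:
                best, best_depth = node, depth
            # Subtree mass only shrinks going down, so descend only from
            # nodes that still meet the threshold.
            for ch in children.get(node, []):
                stack.append((ch, depth + 1))
    return best

# Toy hierarchy: root -> {animal, vehicle}, animal -> {cat, dog}
children = {"root": ["animal", "vehicle"], "animal": ["cat", "dog"]}
probs = {"cat": 0.45, "dog": 0.25, "vehicle": 0.30}
print(bayes_optimal("root", children, probs, c=0.0))  # "animal" (0.70 >= 0.50)
print(bayes_optimal("root", children, probs, c=0.5))  # "root"   (needs 0.75)
```

The toy run illustrates the trade-off the penalty $c$ controls: at $c=0$ the rule commits to the superclass "animal", while at $c=0.5$ no proper subtree reaches the stricter $0.75$ threshold and the rule retreats to the root.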