Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

Mahajan, Dhruv; Gupta, Vivek; Keerthi, S Sathiya; Sundararajan, Sellamanickam; Narayanamurthy, Shravan; Kidambi, Rahul


arXiv:1711.05482 (cs)

[Submitted on 15 Nov 2017]



Abstract: For many applications, an ensemble of base classifiers is an effective solution. Tuning its parameters (number of classes, amount of data on which each classifier is to be trained, etc.) requires G, the generalization error of a given ensemble. The efficient estimation of G is the focus of this paper. The key idea is to approximate the variance of the class scores/probabilities of the base classifiers, over the randomness imposed by the training subset, by a normal/beta distribution at each point x in the input feature space. We estimate the parameters of the distribution using a small set of randomly chosen base classifiers and use those parameters to give efficient estimation schemes for G. We give empirical evidence for the quality of the various estimators. We also demonstrate their usefulness in making design choices, such as the number of classifiers in the ensemble and the size of the data subset used for training that is needed to achieve a given generalization error. Our approach also has great potential for designing distributed ensemble classifiers.
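The key idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's estimator, only one plausible reading under assumptions that are ours: binary labels in {-1, +1}, signed real-valued scores, an ensemble that averages the scores of `ensemble_size` base classifiers, and scores at each point x that are independent and approximately normal over the choice of training subset. The function name `estimate_ensemble_error` and its arguments are hypothetical.

```python
import math

import numpy as np


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def estimate_ensemble_error(scores, labels, ensemble_size):
    """Estimate the error of a score-averaging ensemble (illustrative only).

    scores: array of shape (m, n); scores[j, i] is the signed score that the
            j-th sampled base classifier assigns to validation point i.
    labels: array of shape (n,) with entries +1 or -1.
    ensemble_size: number of base classifiers K in the hypothetical ensemble.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Per-point normal parameters, fitted from the small sample of classifiers.
    mu = scores.mean(axis=0)
    sigma = np.maximum(scores.std(axis=0, ddof=1), 1e-12)

    # Assumption: the mean of K independent scores is N(mu, sigma^2 / K),
    # so the ensemble errs at x when that mean has the wrong sign.
    z = -labels * mu * math.sqrt(ensemble_size) / sigma
    per_point_error = np.array([normal_cdf(v) for v in z])

    # Average the per-point error probabilities over the validation points.
    return float(per_point_error.mean())


if __name__ == "__main__":
    sampled_scores = [[1.0, -0.5], [0.5, -1.0], [1.5, -1.5]]
    validation_labels = [1, -1]
    for k in (1, 5, 25):
        print(k, estimate_ensemble_error(sampled_scores, validation_labels, k))
```

Under these assumptions, only the few sampled classifiers need to be trained; the estimate for any other ensemble size follows by changing `ensemble_size`, which is the kind of design choice the abstract mentions.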

Comments:	12 pages, 4 figures, under review at SDM 2018
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1711.05482 [cs.LG]
	(or arXiv:1711.05482v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1711.05482

Submission history

From: Vivek Gupta
[v1] Wed, 15 Nov 2017 10:03:01 UTC (383 KB)
