Authors: Sanzhar Aubakirov 1 ; Paulo Trigo 2 and Darhan Ahmed-Zaki 1

Affiliations: 1 Department of Computer Science, Al-Farabi Kazakh National University, Almaty, Kazakhstan ; 2 Instituto Superior de Engenharia de Lisboa, Biosystems and Integrative Sciences Institute Agent and Systems Modeling, Lisbon, Portugal

Keyword(s): Distributed Computing, Text Processing, Machine Learning, Hyperparameters Optimization.

Related Ontology Subjects/Areas/Topics: Business Analytics ; Data Engineering ; Data Management and Quality ; Text Analytics

Abstract: In this paper, we propose an optimization workflow to predict classifier accuracy based on the exploration of the space composed of different data features and the configurations of the classification algorithms. The overall process is described considering the text classification problem. We take three main features that affect text classification and therefore the accuracy of classifiers. The first feature considers the words that comprise the input text; here we use the N-gram concept with different N values. The second feature considers the adoption of textual pre-processing steps such as stop-word filtering and stemming. The third feature considers the hyperparameters of the classification algorithms. In this paper, we take the well-known classifiers K-Nearest Neighbors (KNN) and Naive Bayes (NB), where K (from KNN) and the a-priori probabilities (from NB) are hyperparameters that influence accuracy. As a result, we explore the feature space (correlation among textual and classifier aspects) and we present an approximation model that is able to predict classifier accuracy.
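
The space described in the abstract can be pictured as a grid of configurations. The following is a minimal sketch, not the authors' implementation: it assumes word-level N-grams, a tiny illustrative stop-word list, and only the K hyperparameter of KNN. The names `STOP_WORDS`, `word_ngrams` and `configuration_space` are hypothetical; stemming and the NB a-priori probabilities are left out for brevity.

```python
from itertools import product

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "is"}


def word_ngrams(text, n, drop_stop_words=False):
    """Return the word N-grams of `text` as a list of tuples."""
    tokens = text.lower().split()
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    # Slide a window of width n over the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def configuration_space(n_values, stop_word_options, knn_k_values):
    """Enumerate every (N, stop-word filtering, K) combination."""
    return [
        {"n": n, "drop_stop_words": sw, "k": k}
        for n, sw, k in product(n_values, stop_word_options, knn_k_values)
    ]


# Assumed value ranges; the abstract does not state the ones actually used.
configs = configuration_space([1, 2, 3], [False, True], [1, 3, 5])
bigrams = word_ngrams("the accuracy of the classifier", 2, drop_stop_words=True)
```

Each entry of `configs` identifies one point at which a classifier could be trained and scored; an approximation model such as the one the abstract mentions would then be fitted to the resulting (configuration, accuracy) pairs.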

CC BY-NC-ND 4.0

Paper citation in several formats:
Aubakirov, S., Trigo, P. and Ahmed-Zaki, D. (2018). Distributed Optimization of Classifier Committee Hyperparameters. In Proceedings of the 7th International Conference on Data Science, Technology and Applications - DATA; ISBN 978-989-758-318-6; ISSN 2184-285X, SciTePress, pages 171-179. DOI: 10.5220/0006884101710179

@conference{data18,
author={Sanzhar Aubakirov and Paulo Trigo and Darhan Ahmed{-}Zaki},
title={Distributed Optimization of Classifier Committee Hyperparameters},
booktitle={Proceedings of the 7th International Conference on Data Science, Technology and Applications - DATA},
year={2018},
pages={171-179},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006884101710179},
isbn={978-989-758-318-6},
issn={2184-285X},
}

TY - CONF

JO - Proceedings of the 7th International Conference on Data Science, Technology and Applications - DATA
TI - Distributed Optimization of Classifier Committee Hyperparameters
SN - 978-989-758-318-6
IS - 2184-285X
AU - Aubakirov, S.
AU - Trigo, P.
AU - Ahmed-Zaki, D.
PY - 2018
SP - 171
EP - 179
DO - 10.5220/0006884101710179
PB - SciTePress