Authors: Elias Oliveira 1 ; Howard Roatti 1 ; Matheus de Araujo Nogueira 2 ; Henrique Gomes Basoni 1 and Patrick Marques Ciarelli 1

Affiliations: 1 Universidade Federal do Espírito Santo, Brazil ; 2 Fundação de Assistência e Educação FAESA, Brazil

Keyword(s): Text Classification, Social Network, Text Mining.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Computational Intelligence ; Concept Mining ; Evolutionary Computing ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Mining Text and Semi-Structured Data ; Soft Computing ; Symbolic Systems

Abstract: The usual practice in the classification problem is to create a set of labeled data for training and then use it to tune a classifier for predicting the classes of the remaining items in the dataset. However, labeled data demand great human effort, and classification by specialists is normally expensive and consumes a large amount of time. In this paper, we discuss how we can benefit from a cluster-based tree kNN structure to quickly build a training dataset from scratch. We evaluated the proposed method on some classification datasets, and the results are promising because we reduced the amount of labeling work by the specialists to 4% of the number of documents in the evaluated datasets. Furthermore, we achieved an average accuracy of 72.19% on tested datasets, versus 77.12% when using 90% of the dataset for training.
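The general idea of labeling only a few representative items and letting nearest-neighbor search label the rest can be illustrated with a minimal sketch. This is not the cluster-based tree structure evaluated in the paper: it swaps in a flat greedy farthest-point selection as a stand-in for clustering, followed by 1-nearest-neighbor label propagation. The function names, the synthetic two-class data, and the choice of 4 representatives out of 100 points (mirroring the 4% labeling effort mentioned in the abstract) are all assumptions made for illustration.

```python
import math
import random


def pick_representatives(points, m):
    """Greedy farthest-point selection (a simple stand-in for clustering)."""
    chosen = [0]  # arbitrary start: the first point
    # Distance from every point to its closest chosen representative.
    dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < m:
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen


def propagate_labels(points, rep_idx, rep_labels):
    """Assign each point the label of its nearest labeled representative (1-NN)."""
    predicted = []
    for p in points:
        nearest = min(
            range(len(rep_idx)),
            key=lambda j: math.dist(p, points[rep_idx[j]]),
        )
        predicted.append(rep_labels[nearest])
    return predicted


def make_toy_data(n_per_class=50, seed=0):
    """Hypothetical synthetic data: two well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    centers = {"a": (0.0, 0.0), "b": (20.0, 20.0)}
    points, labels = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            points.append((cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)))
            labels.append(label)
    return points, labels


points, true_labels = make_toy_data()
rep_idx = pick_representatives(points, m=4)      # 4 of 100 points
rep_labels = [true_labels[i] for i in rep_idx]   # only these are "hand-labeled"
predicted = propagate_labels(points, rep_idx, rep_labels)

labeling_effort = len(rep_idx) / len(points)
accuracy = sum(p == t for p, t in zip(predicted, true_labels)) / len(points)
print(f"labeled fraction: {labeling_effort:.2%}, accuracy: {accuracy:.2%}")
```

The toy blobs are far easier to separate than real text collections, so the accuracy printed here says nothing about the figures reported in the abstract; the sketch only shows how the labeling effort and the resulting accuracy can be measured side by side.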

CC BY-NC-ND 4.0

Paper citation in several formats:
Oliveira, E. ; Roatti, H. ; Nogueira, M. ; Basoni, H. and Ciarelli, P. (2015). Using the Cluster-based Tree Structure of k-Nearest Neighbor to Reduce the Effort Required to Classify Unlabeled Large Datasets. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - SSTM; ISBN 978-989-758-158-8; ISSN 2184-3228, SciTePress, pages 567-576. DOI: 10.5220/0005615305670576

@conference{sstm15,
author={Elias Oliveira and Howard Roatti and Matheus de Araujo Nogueira and Henrique Gomes Basoni and Patrick Marques Ciarelli},
title={Using the Cluster-based Tree Structure of k-Nearest Neighbor to Reduce the Effort Required to Classify Unlabeled Large Datasets},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - SSTM},
year={2015},
pages={567-576},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005615305670576},
isbn={978-989-758-158-8},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - SSTM
TI - Using the Cluster-based Tree Structure of k-Nearest Neighbor to Reduce the Effort Required to Classify Unlabeled Large Datasets
SN - 978-989-758-158-8
IS - 2184-3228
AU - Oliveira, E.
AU - Roatti, H.
AU - Nogueira, M.
AU - Basoni, H.
AU - Ciarelli, P.
PY - 2015
SP - 567
EP - 576
DO - 10.5220/0005615305670576
PB - SciTePress