SCUT: Multi-Class Imbalanced Data Classification using SMOTE and Cluster-based Undersampling

Astha Agrawal; Herna L. Viktor; Eric Paquet

Topics: Clustering and Classification Methods; Machine Learning; Pre-Processing and Post-Processing for Data Mining

In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: IC3K, pages 226-234, 2015, Lisbon, Portugal

Authors: Astha Agrawal ¹; Herna L. Viktor ¹ and Eric Paquet ²

Affiliations: ¹ University of Ottawa, Canada; ² University of Ottawa and National Research Council of Canada, Canada

Keywords: Multi-Class Imbalance, Undersampling, Oversampling, Classification, Clustering.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Clustering and Classification Methods; Computational Intelligence; Evolutionary Computing; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Pre-Processing and Post-Processing for Data Mining; Soft Computing; Symbolic Systems

Abstract:

Class imbalance is a crucial problem in machine learning and occurs in many domains. The two-class problem in particular has received interest from researchers in recent years, leading to solutions for oil spill detection, tumour discovery and credit card fraud detection, amongst others. However, handling class imbalance in datasets that contain multiple classes, with varying degrees of imbalance, has received limited attention. In such a multi-class imbalanced dataset, the classification model tends to favour the majority classes and to misclassify instances from the minority classes as belonging to the majority classes, leading to poor predictive accuracy. Further, there is a need to handle both the imbalance between classes and the selection of examples within a class (i.e. the so-called within-class imbalance). In this paper, we propose the SCUT hybrid sampling method, which balances the number of training examples in such a multi-class setting. Our SCUT approach oversamples minority class examples through the generation of synthetic examples and employs cluster analysis to undersample majority classes. In addition, it handles both within-class and between-class imbalance. Our experimental results on a number of multi-class problems show that, when the SCUT method is used to pre-process the data before classification, we obtain highly accurate models that compare favourably to the state-of-the-art.
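
The hybrid sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the clustering algorithm, the target class size, or the neighbourhood size, so the code below assumes k-means as the clustering step, the mean class size as the target, and k = 3 nearest neighbours for the SMOTE-style interpolation. All function names (`smote_oversample`, `cluster_undersample`, `resample`) are hypothetical, and only the Python standard library is used.

```python
import math
import random
from collections import Counter, defaultdict

def smote_oversample(points, target, k=3, rng=random):
    """Grow a minority class to `target` points by SMOTE-style interpolation."""
    synthetic = []
    while len(points) + len(synthetic) < target:
        base = rng.choice(points)
        # Up to k nearest same-class neighbours of the chosen base point.
        neighbours = sorted((p for p in points if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        if not neighbours:  # class with a single point: just duplicate it
            synthetic.append(base)
            continue
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return list(points) + synthetic

def kmeans(points, n_clusters, iters=20, rng=random):
    """Plain k-means (assumed stand-in for the paper's cluster analysis)."""
    centres = rng.sample(points, n_clusters)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in centres]
        for p in points:
            idx = min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
            clusters[idx].append(p)
        centres = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centres[i]
                   for i, cl in enumerate(clusters)]
    return [cl for cl in clusters if cl]

def cluster_undersample(points, target, n_clusters=3, rng=random):
    """Shrink a majority class to `target` points, drawing evenly from clusters."""
    target = min(target, len(points))
    clusters = kmeans(points, min(n_clusters, len(points)), rng=rng)
    for cl in clusters:
        rng.shuffle(cl)
    kept = []
    while len(kept) < target:  # round-robin so each cluster is represented
        for cl in clusters:
            if cl and len(kept) < target:
                kept.append(cl.pop())
    return kept

def resample(X, y, seed=0):
    """Balance every class to the mean class size (an assumed target)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(tuple(x))
    target = round(len(X) / len(by_class))
    X_out, y_out = [], []
    for label, pts in by_class.items():
        if len(pts) < target:
            pts = smote_oversample(pts, target, rng=rng)
        elif len(pts) > target:
            pts = cluster_undersample(pts, target, rng=rng)
        X_out.extend(pts)
        y_out.extend([label] * len(pts))
    return X_out, y_out

# Toy data: class sizes 30, 6 and 3, so the mean class size is 13.
data_rng = random.Random(1)
X, y = [], []
for label, size, centre in [("a", 30, 0.0), ("b", 6, 5.0), ("c", 3, 10.0)]:
    for _ in range(size):
        X.append((centre + data_rng.gauss(0, 1), centre + data_rng.gauss(0, 1)))
        y.append(label)

X_bal, y_bal = resample(X, y)
counts = Counter(y_bal)  # every class now holds 13 points
```

Drawing from each cluster in turn is one simple way to keep the undersampled majority class representative of its internal structure, which is the within-class concern the abstract raises. The balanced `X_bal`, `y_bal` would then be passed to any classifier as training data.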

CC BY-NC-ND 4.0

Paper citation in several formats:

Agrawal, A. ; Viktor, H. and Paquet, E. (2015). SCUT: Multi-Class Imbalanced Data Classification using SMOTE and Cluster-based Undersampling. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR; ISBN 978-989-758-158-8; ISSN 2184-3228, SciTePress, pages 226-234. DOI: 10.5220/0005595502260234

@conference{kdir15,
author={Astha Agrawal and Herna L. Viktor and Eric Paquet},
title={SCUT: Multi-Class Imbalanced Data Classification using SMOTE and Cluster-based Undersampling},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR},
year={2015},
pages={226-234},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005595502260234},
isbn={978-989-758-158-8},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - KDIR
TI - SCUT: Multi-Class Imbalanced Data Classification using SMOTE and Cluster-based Undersampling
SN - 978-989-758-158-8
IS - 2184-3228
AU - Agrawal, A.
AU - Viktor, H.
AU - Paquet, E.
PY - 2015
SP - 226
EP - 234
DO - 10.5220/0005595502260234
PB - SciTePress