International Journal of Computational Intelligence Systems

Volume 8, Issue 6, December 2015, Pages 1178 - 1191

DisCoSet: Discovery of Contrast Sets to Reduce Dimensionality and Improve Classification

Authors
Zaher Al Aghbari, Imran N. Junejo
Corresponding Author
Zaher Al Aghbari
Received 30 October 2014, Accepted 13 October 2015, Available Online 1 December 2015.
DOI
10.1080/18756891.2015.1113750
Keywords
Contrast sets, dimensionality reduction, classification, information retrieval, data mining
Abstract

Traditionally, contrast set mining aims to find a set of rules that best distinguish the instances of different user-defined groups. Contrast sets are conjunctions of attribute-value pairs that are significantly more frequent in one group than in the others. Typically, these contrast sets are extracted from categorical data or discretized numerical data. Existing rule-based contrast set methods require user-defined thresholds to select the contrast sets. In this paper, we propose a greedy algorithm, called DisCoSet, that incrementally finds a minimum set of local features that best distinguishes a class from the other classes without resorting to discretization. The discovered contrast sets considerably reduce the dimensionality of the feature vectors and significantly improve classification accuracy. We show that, across different types of datasets, the proposed algorithm reduces the dimensionality of class instances by 40%-97% of the original length and yet improves classification accuracy by 10%-24%.
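
To make the traditional definition concrete, the following minimal Python sketch computes how often a conjunction of attribute-value pairs occurs in each group. It illustrates only the classical notion of a contrast set described above, not the DisCoSet algorithm itself. The function name `group_support`, the toy records, and the attribute names are hypothetical, and the statistical significance test that a real contrast set miner would apply is omitted.

```python
from collections import defaultdict


def group_support(records, groups, contrast_set):
    """Return {group: fraction of that group's records that match
    every attribute-value pair in contrast_set}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record, group in zip(records, groups):
        totals[group] += 1
        # A record matches only if the whole conjunction holds.
        if all(record.get(attr) == value for attr, value in contrast_set.items()):
            hits[group] += 1
    return {g: hits[g] / totals[g] for g in totals}


# Hypothetical toy data: three instances in group "A", three in group "B".
records = [
    {"degree": "PhD", "sector": "academia"},
    {"degree": "PhD", "sector": "academia"},
    {"degree": "BSc", "sector": "industry"},
    {"degree": "PhD", "sector": "industry"},
    {"degree": "BSc", "sector": "academia"},
    {"degree": "BSc", "sector": "industry"},
]
groups = ["A", "A", "A", "B", "B", "B"]

# Candidate contrast set: a conjunction of two attribute-value pairs.
candidate = {"degree": "PhD", "sector": "academia"}
support = group_support(records, groups, candidate)

# A large gap in support between groups marks the candidate as a
# possible contrast set; a real miner would also test significance.
support_gap = support["A"] - support["B"]
```

In this toy data the conjunction holds for two of the three instances in group A and for none in group B, so it would be a candidate contrast set for A. Methods of this kind need categorical or discretized attributes and a user-chosen threshold on the gap, which are the two requirements the abstract says DisCoSet avoids.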

Copyright
© 2017, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Journal
International Journal of Computational Intelligence Systems
Volume-Issue
8 - 6
Pages
1178 - 1191
Publication Date
2015/12/01
ISSN (Online)
1875-6883
ISSN (Print)
1875-6891

Cite this article

TY  - JOUR
AU  - Zaher Al Aghbari
AU  - Imran N. Junejo
PY  - 2015
DA  - 2015/12/01
TI  - DisCoSet: Discovery of Contrast Sets to Reduce Dimensionality and Improve Classification
JO  - International Journal of Computational Intelligence Systems
SP  - 1178
EP  - 1191
VL  - 8
IS  - 6
SN  - 1875-6883
UR  - https://doi.org/10.1080/18756891.2015.1113750
DO  - 10.1080/18756891.2015.1113750
ID  - AlAghbari2015
ER  -