A new cluster validity measure for bioinformatics relational datasets

M Popescu, JC Bezdek, JM Keller… - … on Fuzzy Systems …, 2008 - ieeexplore.ieee.org
2008 IEEE International Conference on Fuzzy Systems (IEEE World …, 2008
Many important applications in biology have underlying datasets that are relational, that is, only the (dis)similarity between biological objects (amino acid sequences, gene expression profiles, etc.) is known and not their feature values in some feature space. Examples of such relational datasets are the gene similarity matrices obtained from BLAST, gene expression data, or gene ontology (GO) similarity measures. Once a relational dataset is obtained, a common question asked is how many groups of objects are represented in the original dataset. The answer to this question is usually obtained by employing a clustering algorithm and a cluster validity measure. In this article we describe a cluster validity measure for non-Euclidean relational fuzzy c-means that is based on the correlation between a relation induced on the data by the cluster memberships and the original relational data. This validity measure can be applied to partitions made by any fuzzy relational clustering algorithm. We illustrate our measure by validating clusters in several dissimilarity matrices for a set of 194 gene products obtained using BLAST and GO similarities.
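
The idea of correlating a membership-induced relation with the original relational data can be illustrated with a minimal sketch. The abstract does not give the exact form of the induced relation, so the choice below (one minus the inner product of membership columns), the use of Pearson correlation over the upper triangle, and all function names are assumptions made for illustration, not the authors' definition.

```python
from math import sqrt


def induced_dissimilarity(U):
    """Build an n x n dissimilarity from a c x n fuzzy membership matrix U.

    Assumed form: 1 - sum_k U[k][i] * U[k][j]. Objects that share high
    membership in the same cluster get a small induced dissimilarity.
    """
    c, n = len(U), len(U[0])
    return [
        [1.0 - sum(U[k][i] * U[k][j] for k in range(c)) for j in range(n)]
        for i in range(n)
    ]


def upper_triangle(M):
    """Flatten the strict upper triangle of a square matrix (pairs i < j)."""
    n = len(M)
    return [M[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlation_validity(U, D):
    """Correlate the membership-induced dissimilarity with the original D."""
    return pearson(upper_triangle(induced_dissimilarity(U)), upper_triangle(D))


# Toy relational data (hypothetical): objects {0, 1} and {2, 3} form two groups.
D = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
U_good = [[1, 1, 0, 0], [0, 0, 1, 1]]  # partition matching the groups
U_bad = [[1, 0, 1, 0], [0, 1, 0, 1]]   # partition splitting each group

score_good = correlation_validity(U_good, D)
score_bad = correlation_validity(U_bad, D)
```

Under these assumptions, a partition that agrees with the structure in the dissimilarity matrix scores higher than one that does not. To estimate the number of groups, one would presumably compute such a score for partitions at several candidate values of c and prefer the value with the highest correlation.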