DOI: 10.1145/2393216.2393229
research-article

Discretization in gene expression data analysis: a selected survey

Published: 26 October 2012

Abstract

Discretization techniques are widely used as a preprocessing step for classification, especially in machine learning. They have also been used as a preprocessing step for the computational construction of regulatory networks in gene expression data analysis. We analyze the use of some widely used discretization techniques in other gene expression data analysis tasks, such as gene function prediction. This paper evaluates the performance of these techniques as a preprocessing step by applying different clustering algorithms to the discretized gene expression data. The resulting clusterings are validated with both internal and external measures for each discretization technique. Finally, we introduce some of the important issues and research challenges.
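
As a rough illustration of what discretizing an expression profile can look like, the sketch below bins one gene's values in two ways. The abstract does not name the techniques evaluated, so equal-width and equal-frequency binning are assumed here only as two common unsupervised examples; the function names and the sample values are hypothetical.

```python
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], k: int) -> List[int]:
    """Assign each value to one of k equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant profile carries no variation, so everything shares bin 0.
        return [0] * len(values)
    width = (hi - lo) / k
    # The maximum value would fall just past the last interval, so clamp it.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values: Sequence[float], k: int) -> List[int]:
    """Assign each value to one of k bins holding roughly equal counts.

    Bins are assigned by rank, so tied values may be split across bins.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * k // n
    return labels


# Hypothetical expression profile with one strong outlier.
profile = [0.0, 1.0, 2.0, 3.0, 10.0]
width_labels = equal_width_bins(profile, 2)
freq_labels = equal_frequency_bins(profile, 2)
```

With this profile the outlier stretches the equal-width intervals so that four of the five values share a bin, while the rank-based version splits the values more evenly. Differences of this kind are one reason the choice of discretization can affect downstream clustering.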


    Published In

    CCSEIT '12: Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology
    October 2012
    800 pages
    ISBN:9781450313100
    DOI:10.1145/2393216

    Sponsors

    • Avinashilingam University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. discretization
    2. gene expression data analysis
    3. gene expression data preprocessing
