DOI: 10.1145/2905055.2905182
research-article

A Fast Gene Expression Analysis using Parallel Biclustering and Distributed Triclustering Approach

Published: 04 March 2016

Abstract

Biclustering, or simultaneous clustering, mines a G×S dataset row-wise and column-wise into groups of genes coexpressed across a subset of conditions. Triclustering is a recent advancement in unsupervised learning that groups genes under a subset of conditions and time points over the G×S×T space. As data size grows, the cost of tricluster extraction becomes too high, which calls for a cost-effective triclustering method that distributes the computational load to obtain optimal results. This paper presents a fast shared-memory biclustering and shared-nothing triclustering architecture for analyzing gene expression data to identify coexpressed patterns of high biological significance over the G×S×T space. The proposed triclustering approach was found able to identify shifted, scaled, and shifted-and-scaled coexpressed patterns at minimum cost over several benchmark datasets.
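
To make the terms in the abstract concrete, the following minimal sketch shows what shifted, scaled, and shifted-and-scaled patterns look like, and how a candidate tricluster can be read as a sub-cube of a G×S×T array. It is not the paper's algorithm: the use of Pearson correlation as the coherence test, the 0.99 threshold, the toy values, and the helper names `pearson`, `profile`, and `is_coherent` are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical expression profile and three patterns derived from it.
base = [1.0, 3.0, 2.0, 5.0]
shifted = [v + 4.0 for v in base]               # additive shift
scaled = [2.5 * v for v in base]                # positive multiplicative scale
shifted_scaled = [2.5 * v + 4.0 for v in base]  # both at once
unrelated = [5.0, 1.0, 4.0, 2.0]                # no linear relation to base

def profile(cube, gene, samples, times):
    """Flatten one gene's values over the chosen samples and time points."""
    return [cube[gene][s][t] for s in samples for t in times]

def is_coherent(cube, genes, samples, times, threshold=0.99):
    """Assumed coherence test: every pair of gene profiles in the sub-cube
    correlates at or above the threshold."""
    profiles = [profile(cube, g, samples, times) for g in genes]
    return all(
        pearson(p, q) >= threshold
        for i, p in enumerate(profiles)
        for q in profiles[i + 1:]
    )

# Toy G×S×T cube indexed as cube[gene][sample][time]: 3 genes, 2 samples, 2 times.
cube = [
    [[1.0, 3.0], [2.0, 5.0]],      # gene 0: base
    [[6.5, 11.5], [9.0, 16.5]],    # gene 1: 2.5 * base + 4
    [[5.0, 1.0], [4.0, 2.0]],      # gene 2: unrelated
]

genes_01_coherent = is_coherent(cube, [0, 1], [0, 1], [0, 1])
genes_012_coherent = is_coherent(cube, [0, 1, 2], [0, 1], [0, 1])
```

Because Pearson correlation is unchanged by an additive shift and by a positive scale factor, all three derived profiles correlate perfectly with `base`, which is why a correlation-style measure is a natural fit for shifting-and-scaling patterns. In the toy cube, genes 0 and 1 pass the assumed coherence test over all samples and time points, and adding gene 2 breaks it.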

Published In

ICTCS '16: Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies
March 2016
843 pages
ISBN:9781450339629
DOI:10.1145/2905055
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Distributed
  2. biclusters
  3. coexpressed
  4. shifting-and-scaling patterns
  5. triclusters

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICTCS '16

Acceptance Rates

Overall Acceptance Rate 97 of 270 submissions, 36%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 5
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 06 Nov 2024
