DOI: 10.1145/1982185.1982209

Classifying microarray data with association rules

Published: 21 March 2011

Abstract

In this paper we investigate a method for classifying microarray data using association rules. Associative classifiers, which are classification systems built on association rules, achieve good performance while remaining easy to read and understand. This interpretability is especially attractive for biological data, where experts can read and validate the rules. Relevant features are selected using Support Vector Machines with Recursive Feature Elimination (SVM-RFE). The selected features are discretized according to their relative expression levels (upregulated, downregulated, or no change) and then used to build an associative classifier. The proposed combination proves highly accurate on the microarray data collection studied. In addition, the classification rules discovered and used in the classification process prove to be biologically relevant.
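The discretization and rule-based classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes feature selection (SVM-RFE in the paper) has already reduced the data to a few genes, and the log-ratio threshold, the brute-force rule miner, the rule ranking, and the first-match prediction are all assumptions chosen for brevity. The gene names and expression values are made up.

```python
from collections import Counter
from itertools import combinations


def discretize(sample, threshold=1.0):
    """Map {gene: log-ratio} to a set of (gene, state) items.

    Assumption: |log-ratio| below `threshold` means "no change", and such
    genes are dropped so that rules mention only up/down-regulated genes.
    """
    items = set()
    for gene, value in sample.items():
        if value >= threshold:
            items.add((gene, "up"))
        elif value <= -threshold:
            items.add((gene, "down"))
    return frozenset(items)


def mine_rules(transactions, labels, min_support=0.3, min_confidence=0.8, max_len=2):
    """Brute-force class association rules of the form itemset -> class."""
    n = len(transactions)
    body_count = Counter()  # how often an itemset occurs at all
    rule_count = Counter()  # how often it occurs together with a class
    for items, label in zip(transactions, labels):
        for k in range(1, max_len + 1):
            for body in combinations(sorted(items), k):
                body_count[body] += 1
                rule_count[(body, label)] += 1
    rules = []
    for (body, label), count in rule_count.items():
        support = count / n
        confidence = count / body_count[body]
        if support >= min_support and confidence >= min_confidence:
            rules.append((frozenset(body), label, support, confidence))
    # Rank by confidence, then support, then shorter rule bodies first.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, items, default=None):
    """Return the class of the highest-ranked rule whose body matches."""
    for body, label, _, _ in rules:
        if body <= items:
            return label
    return default


# Toy data with hypothetical gene names and values, for illustration only.
samples = [
    {"geneA": 2.1, "geneB": -0.2, "geneC": -1.5},
    {"geneA": 1.7, "geneB": 0.1, "geneC": -1.1},
    {"geneA": -1.3, "geneB": 1.9, "geneC": 0.3},
    {"geneA": -2.0, "geneB": 1.2, "geneC": 0.0},
]
labels = ["tumor", "tumor", "normal", "normal"]

transactions = [discretize(s) for s in samples]
rules = mine_rules(transactions, labels)
prediction = classify(rules, discretize({"geneA": 1.9, "geneB": 0.0, "geneC": -0.4}))
```

Each mined rule reads directly as a statement such as "geneA upregulated implies tumor", which is the readability property the abstract highlights. Exhaustive enumeration like this only scales to a handful of selected genes; real microarray data would need a dedicated frequent-pattern miner.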


Published In

SAC '11: Proceedings of the 2011 ACM Symposium on Applied Computing
March 2011
1868 pages
ISBN:9781450301138
DOI:10.1145/1982185
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. association rules
  2. classification
  3. microarray data

Qualifiers

  • Research-article

Conference

SAC'11
SAC'11: The 2011 ACM Symposium on Applied Computing
March 21 - 24, 2011
TaiChung, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
