- Article, August 2006
Is there a grand challenge or X-prize for data mining?
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 954–956, https://doi.org/10.1145/1150402.1150535
This panel will discuss possible exciting and motivating Grand Challenge problems for Data Mining, focusing on bioinformatics, multimedia mining, link mining, text mining, and web mining.
- Article, August 2006
Beyond classification and ranking: constrained optimization of the ROI
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 948–953, https://doi.org/10.1145/1150402.1150533
Classification has been commonly used in many data mining projects in the financial service industry. For instance, to predict collectability of accounts receivable, a binary class label is created based on whether a payment is received within a certain ...
- Article, August 2006
Camouflaged fraud detection in domains with complex relationships
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 941–947, https://doi.org/10.1145/1150402.1150532
We describe a data mining system to detect frauds that are camouflaged to look like normal activities in domains with high number of known relationships. Examples include accounting fraud detection for rating and investment, insider attacks on corporate ...
- Article, August 2006
Discovering significant OPSM subspace clusters in massive gene expression data
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 922–928, https://doi.org/10.1145/1150402.1150529
Order-preserving submatrixes (OPSMs) have been accepted as a biologically meaningful subspace cluster model, capturing the general tendency of gene expressions across a subset of conditions. In an OPSM, the expression levels of all genes induce the same ...
- Article, August 2006
A component-based framework for knowledge discovery in bioinformatics
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 916–921, https://doi.org/10.1145/1150402.1150528
Motivation: In the field of bioinformatics there is an emerging need to integrate all knowledge discovery steps into a standardized modular framework. Indeed, component-based development can significantly enhance reusability and productivity for short ...
- Article, August 2006
Mining citizen science data to predict prevalence of wild bird species
- Rich Caruana,
- Mohamed Elhawary,
- Art Munson,
- Mirek Riedewald,
- Daria Sorokina,
- Daniel Fink,
- Wesley M. Hochachka,
- Steve Kelling
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 909–915, https://doi.org/10.1145/1150402.1150527
The Cornell Laboratory of Ornithology's mission is to interpret and conserve the earth's biological diversity through research, education, and citizen science focused on birds. Over the years, the Lab has accumulated one of the largest and longest-...
- Article, August 2006
Opportunity map: identifying causes of failure - a deployed data mining system
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 892–901, https://doi.org/10.1145/1150402.1150524
In this paper, we report a deployed data mining application system for Motorola. Originally, its intended use was for identifying causes of cellular phone failures, but it has been found to be useful for many other engineering data sets as well. For ...
- Article, August 2006
Understandable models of music collections based on exhaustive feature generation with temporal statistics
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 882–891, https://doi.org/10.1145/1150402.1150523
Data mining in large collections of polyphonic music has recently received increasing interest by companies along with the advent of commercial online distribution of music. Important applications include the categorization of songs into genres and the ...
- Article, August 2006
GPLAG: detection of software plagiarism by program dependence graph analysis
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 872–881, https://doi.org/10.1145/1150402.1150522
Along with the blossom of open source projects comes the convenience for software plagiarism. A company, if less self-disciplined, may be tempted to plagiarize some open source projects for its own products. Although current plagiarism detection tools ...
- Article, August 2006
Mining for proposal reviewers: lessons learned at the National Science Foundation
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 862–871, https://doi.org/10.1145/1150402.1150521
In this paper, we discuss a prototype application deployed at the U.S. National Science Foundation for assisting program directors in identifying reviewers for proposals. The application helps program directors sort proposals into panels and find ...
- Article, August 2006
Pragmatic text mining: minimizing human effort to quantify many issues in call logs
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 852–861, https://doi.org/10.1145/1150402.1150520
We discuss our experiences in analyzing customer-support issues from the unstructured free-text fields of technical-support call logs. The identification of frequent issues and their accurate quantification is essential in order to track aggregate costs ...
- Article, August 2006
Data mining challenges in the automotive domain
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Page 836, https://doi.org/10.1145/1150402.1150516
Automotive companies, such as Ford Motor Company, have no shortage of large databases with abundant opportunities for cost reduction and revenue enhancement. The Data Mining Group at Ford has worked in the areas of Quality, Customer Satisfaction and ...
- Article, August 2006
Information extraction, data mining and joint inference
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Page 835, https://doi.org/10.1145/1150402.1150515
Although information extraction and data mining appear together in many applications, their interface in most current systems would better be described as serial juxtaposition than as tight integration. Information extraction populates slots in a ...
- Article, August 2006
Capital One's statistical problems: our top ten list
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Page 834, https://doi.org/10.1145/1150402.1150514
Capital One is a highly quantitatively driven diversified financial services firm. As such, we make broad and deep use of the entire repertory of highly quantitative techniques. This talk will present our top ten statistical problems. Indeed, one of ...
- Article, August 2006
Introducing perpetual analytics
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Page 833, https://doi.org/10.1145/1150402.1150513
Common strategies to liberate an organization's information assets for situational awareness frequently rely on infrastructure components such as data integration, enterprise search, federation, data warehousing, and so on. And while these traditional ...
- Article, August 2006
BLOSOM: a framework for mining arbitrary boolean expressions
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 827–832, https://doi.org/10.1145/1150402.1150511
We introduce a novel framework, called BLOSOM, for mining (frequent) boolean expressions over binary-valued datasets. We organize the space of boolean expressions into four categories: pure conjunctions, pure disjunctions, conjunction of disjunctions, ...
- Article, August 2006
Identifying bridging rules between conceptual clusters
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 815–820, https://doi.org/10.1145/1150402.1150509
A bridging rule in this paper has its antecedent and action from different conceptual clusters. We first design two algorithms for mining bridging rules between clusters in a database, and then propose two non-linear metrics for measuring the ...
- Article, August 2006
Mining progressive confident rules
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 803–808, https://doi.org/10.1145/1150402.1150507
Many real world objects have states that change over time. By tracking the state sequences of these objects, we can study their behavior and take preventive measures before they reach some undesirable states. In this paper, we propose a new kind of ...
- Article, August 2006
Coherent closed quasi-clique discovery from large dense graph databases
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 797–802, https://doi.org/10.1145/1150402.1150506
Frequent coherent subgraphs can provide valuable knowledge about the underlying internal structure of a graph database, and mining frequently occurring coherent subgraphs from large dense graph databases has been witnessed several applications and ...
- Article, August 2006
Integration of semantic-based bipartite graph representation and mutual refinement strategy for biomedical literature clustering
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 791–796, https://doi.org/10.1145/1150402.1150505
We introduce a novel document clustering approach that overcomes those problems by combining a semantic-based bipartite graph representation and a mutual refinement strategy. The primary contributions of this paper are the following. First, we introduce ...