Introduction to the Domain-Driven Data Mining Special Section
Summary form only given. In the last decade, data mining has emerged as one of the most dynamic and lively areas in information technology. Although many algorithms and techniques for data mining have been proposed, they either focus on domain ...
Domain-Driven Data Mining: Challenges and Prospects
Traditional data mining research mainly focuses on developing, demonstrating, and pushing the use of specific algorithms and models. The process of data mining stops at pattern identification. Consequently, a widely seen fact is that 1) many algorithms ...
Bridging Domains Using World Wide Knowledge for Transfer Learning
A major problem of classification learning is the lack of ground-truth labeled data. It is usually expensive to label new data instances for training a model. To solve this problem, domain adaptation in transfer learning has been proposed to classify ...
Knowledge-Based Interactive Postmining of Association Rules Using Ontologies
In Data Mining, the usefulness of association rules is strongly limited by the huge amount of delivered rules. To overcome this drawback, several methods were proposed in the literature such as itemset concise representations, redundancy reduction, and ...
Logic-Based Pattern Discovery
In the data mining field, association rules are discovered having domain knowledge specified as a minimum support threshold. The accuracy in setting up this threshold directly influences the number and the quality of association rules discovered. Often, ...
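To illustrate how the minimum support threshold controls the number of patterns, here is a minimal sketch. It is not the method of the paper above: it counts frequent itemsets (the step that the threshold governs before rules are derived) by brute-force enumeration, and the function name `frequent_itemsets` and the toy transactions are assumptions made only for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Brute-force enumeration over all item combinations; fine for a toy
    example, not for real data (hypothetical helper, not from the paper).
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = fraction of transactions containing every item.
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Toy data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

low = frequent_itemsets(transactions, 0.25)   # permissive threshold
high = frequent_itemsets(transactions, 0.75)  # strict threshold
print(len(low), len(high))
```

On this toy data the permissive threshold keeps every itemset that occurs at all, while the strict one keeps only the two most common single items, which is the sensitivity the abstract describes.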
Asking Generalized Queries to Domain Experts to Improve Learning
With the assistance of a domain expert, active learning can often select or construct fewer examples to request their labels to build an accurate classifier. However, previous works of active learning can only generate and ask specific queries. In real-...
Domain-Driven Classification Based on Multiple Criteria and Multiple Constraint-Level Programming for Intelligent Credit Scoring
Extracting knowledge from the transaction records and the personal data of credit card holders has great profit potential for the banking industry. The challenge is to detect/predict bankrupts and to keep and recruit the profitable customers. However, ...
Signaling Potential Adverse Drug Reactions from Administrative Health Databases
The work is motivated by real-world applications of detecting Adverse Drug Reactions (ADRs) from administrative health databases. ADRs are a leading cause of hospitalization and death worldwide. Almost all current postmarket ADR signaling techniques are ...
Feature Selection Using f-Information Measures in Fuzzy Approximation Spaces
The selection of nonredundant and relevant features of real-valued data sets is a highly challenging problem. A novel feature selection method is presented here based on fuzzy-rough sets by maximizing the relevance and minimizing the redundancy of the ...
δ-Presence without Complete World Knowledge
Advances in information technology, and its use in research, are increasing both the need for anonymized data and the risks of poor anonymization. In previous work, we presented a new privacy metric, δ-presence, that clearly links the ...
Privacy-Preserving Gradient-Descent Methods
Gradient descent is a widely used paradigm for solving many optimization problems. Gradient descent aims to minimize a target function in order to reach a local minimum. In machine learning or data mining, this function corresponds to a decision model ...
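The basic (non-private) gradient descent loop that the abstract describes can be sketched as follows. This is a plain illustration under stated assumptions, not the privacy-preserving protocol of the paper: the function name `gradient_descent`, the fixed learning rate, the step count, and the quadratic target are all chosen only for this example.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Minimize a one-dimensional function given its gradient.

    Repeatedly moves x against the gradient with a fixed step size
    (hypothetical helper; parameters are illustrative defaults).
    """
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Assumed target: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Its only minimum is at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

With this step size each iteration shrinks the distance to the minimum by a constant factor, so the iterate ends up very close to 3; for non-convex targets the same loop would only reach a local minimum, as the abstract notes.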
Dynamic Dissimilarity Measure for Support-Based Clustering
Clustering methods utilizing support estimates of a data distribution have recently attracted much attention because of their ability to generate cluster boundaries of arbitrary shape and to deal with outliers efficiently. In this paper, we propose a ...
Kernel Discriminant Learning for Ordinal Regression
Ordinal regression has wide applications in many domains where the human evaluation plays a major role. Most current ordinal regression methods are based on Support Vector Machines (SVM) and suffer from the problems of ignoring the global information of ...