
INCREMENTAL EXTRACTION OF ASSOCIATION RULES IN APPLICATIVE DOMAINS

Published: 01 April 2007

Abstract

In recent years, the KDD process has been advocated as an iterative and interactive process. A user is seldom able to answer all of his or her questions about the data with a single query. Instead, the typical workflow consists of several steps in which the user iteratively refines the extracted knowledge by inspecting previous results and posing new queries. Given this view of the KDD process, it becomes crucial, in order to reduce the computational effort, to have KDD systems that can exploit past results. This is especially true in environments in which the system knowledge base is the result of many discoveries made separately on the data by different collaborating users. In this paper, we consider the problem of mining frequent association rules from database relations. We first model a general, constraint-based mining language for this task. Then, we propose an algorithm that answers such queries by reusing past results. In particular, this solution is effective for a new class of constraints, called context-dependent, which are more difficult to handle than the traditionally studied item-dependent constraints. We nevertheless show that some typical queries in important application domains, such as stock market trading, web log analysis, and gene microarray analysis in bioinformatics, have context-dependent constraints. A set of experiments in these application domains shows that the proposed incremental approach is both effective and viable.
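The general idea of answering a new query from past results can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it assumes the only constraint is a minimum support threshold (an absolute count), and it does not handle context-dependent constraints. The names `mine_frequent_itemsets` and `IncrementalMiner` are hypothetical, and the brute-force miner is only suitable for toy data.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_itemsets(transactions, min_support):
    """Brute-force miner: count every itemset occurring in the data and
    keep those whose absolute support is at least min_support."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


class IncrementalMiner:
    """Caches past results and reuses them when a later query is more
    restrictive (assumption: queries differ only in the support threshold)."""

    def __init__(self, transactions):
        self.transactions = transactions
        self.cache = {}       # min_support -> {itemset: support}
        self.full_scans = 0   # number of times the data was actually scanned

    def query(self, min_support):
        # A past result computed with a lower or equal threshold is a
        # superset of the answer, so filtering it is enough.
        usable = [s for s in self.cache if s <= min_support]
        if usable:
            base = self.cache[max(usable)]
            result = {i: n for i, n in base.items() if n >= min_support}
        else:
            self.full_scans += 1
            result = mine_frequent_itemsets(self.transactions, min_support)
        self.cache[min_support] = result
        return result


transactions = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"]]
miner = IncrementalMiner(transactions)
first = miner.query(2)    # scans the data
second = miner.query(3)   # answered by filtering the cached result
```

In this sketch the second query never touches the transactions, because every itemset with support at least 3 already appears in the result for threshold 2. A query with a lower threshold than any cached one still requires a full scan.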


Published In

Applied Artificial Intelligence, Volume 21, Issue 4-5
April 2007
226 pages
ISSN: 0883-9514
EISSN: 1087-6545

Publisher

Taylor & Francis, Inc.

United States
