DOI: 10.5555/645804.669849
Article

A Genetic Algorithm-Based Solution for the Problem of Small Disjuncts

Published: 13 September 2000

Abstract

In essence, small disjuncts are rules covering a small number of examples. Hence, these rules are usually error-prone, which contributes to a decrease in predictive accuracy. The problem is particularly serious because, although each small disjunct covers few examples, the set of all small disjuncts can cover a large number of examples. This paper proposes a solution, based on genetic algorithms, to the problem of discovering accurate small-disjunct rules. The basic idea of our method is to use a hybrid decision tree / genetic algorithm approach for classification. More precisely, examples belonging to large disjuncts are classified by rules produced by a decision-tree algorithm, while examples belonging to small disjuncts are classified by a new genetic algorithm, specifically designed for discovering small-disjunct rules.
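
The routing idea in the abstract can be illustrated with a minimal sketch. It assumes that each decision-tree leaf is treated as one rule (disjunct) and that a leaf counts as "small" when it covers at most `threshold` training examples. The abstract gives no threshold, so that value, the function names `split_disjuncts` and `classify`, and the `small_classifier` callable standing in for the genetic algorithm are all hypothetical, not the authors' implementation.

```python
from collections import Counter
from typing import Any, Callable, Dict, Hashable, Sequence, Set, Tuple


def split_disjuncts(
    leaf_ids: Sequence[Hashable], threshold: int
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Partition leaves into (large, small) by how many training examples each covers.

    leaf_ids holds, for every training example, the id of the leaf it falls into.
    threshold is an assumed cut-off: a leaf covering <= threshold examples is "small".
    """
    coverage = Counter(leaf_ids)
    small = {leaf for leaf, count in coverage.items() if count <= threshold}
    large = set(coverage) - small
    return large, small


def classify(
    leaf: Hashable,
    example: Any,
    large: Set[Hashable],
    tree_label: Dict[Hashable, Any],
    small_classifier: Callable[[Hashable, Any], Any],
) -> Any:
    """Use the tree's own label for large disjuncts; defer to another classifier otherwise."""
    if leaf in large:
        return tree_label[leaf]
    # Placeholder for the small-disjunct rules (a genetic algorithm in the paper).
    return small_classifier(leaf, example)


# Hypothetical training data: leaf "A" covers 8 examples, "B" covers 2, "C" covers 1.
training_leaves = ["A"] * 8 + ["B"] * 2 + ["C"]
large, small = split_disjuncts(training_leaves, threshold=3)
```

With these made-up counts, leaf "A" is the only large disjunct, while "B" and "C" are small and together still cover 3 of the 11 examples, which mirrors the abstract's point that small disjuncts can collectively cover a substantial share of the data.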


Published In

PKDD '00: Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
September 2000, 698 pages

Publisher

Springer-Verlag, Berlin, Heidelberg
