On the appropriate pattern frequentness measure and pattern generation mode: a critical review

Published: 10 June 2019


Classic pattern mining is a fundamental subject in data mining and big data science. Its goal is to correctly find, from a given dataset, the patterns and their respective intrinsic frequentness. This paper examines two important yet misused instruments, the pattern frequentness measure "support" and the full enumeration pattern generation mode, which cause serious overfitting and thus deviate from the mining goal. A combined theoretical solution to these two critical issues is then proposed. This solution, together with the equilibrium condition introduced in this paper, forms a set of three fundamental rationality check criteria that every mining approach should observe. As a result, the rationality of the mining theory and the reliability of the mining results would be substantially improved over previous work. Together, these promise a significant change towards more effective pattern mining.
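For readers unfamiliar with the two instruments named above, the following is a minimal sketch of their conventional textbook forms, not of the solution the paper proposes. It assumes "support" means the fraction of transactions containing a pattern and "full enumeration" means generating every non-empty sub-itemset of every transaction. The function names and the toy dataset are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Conventional support: fraction of transactions containing every item."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def enumerate_patterns(transactions):
    """Full enumeration: every non-empty subset of every transaction."""
    patterns = set()
    for t in transactions:
        for k in range(1, len(t) + 1):
            patterns.update(frozenset(c) for c in combinations(sorted(t), k))
    return patterns


# Hypothetical toy dataset of four transactions over items a, b, c.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]

patterns = enumerate_patterns(transactions)
table = {p: support(p, transactions) for p in patterns}

for p in sorted(table, key=lambda s: (len(s), sorted(s))):
    print("".join(sorted(p)), table[p])
```

Under these assumptions, a single transaction of n items contributes to the counts of 2^n - 1 patterns (here the one transaction {a, b, c} alone yields all seven patterns in the table), which is the kind of generation behaviour the abstract refers to as full enumeration.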


      IDEAS '19: Proceedings of the 23rd International Database Applications & Engineering Symposium
      June 2019
      Author Tags

      1. data mining
      2. frequentness measure
      3. overfitting
      4. pattern frequency
      5. pattern mining
      6. probability anomaly
      7. selective pattern generation
      8. underfitting


      IDEAS 2019

      Acceptance Rates

      Overall Acceptance Rate 74 of 210 submissions, 35%


