Mining High Utility Itemsets with Hill Climbing and Simulated Annealing

Published: 05 October 2021

Abstract

High utility itemset mining (HUIM) is the task of finding all sets of items purchased together that generate a high profit in a transaction database. Several algorithms have been developed to mine high utility itemsets (HUIs). However, most of them cannot properly handle the exponential search space as the size of the database and the total number of items increase. Recently, evolutionary and heuristic algorithms were designed to mine HUIs, which provided considerable performance improvement. However, they can still have long runtimes, and some may miss many HUIs. To address this problem, this article proposes two algorithms for HUIM based on Hill Climbing (HUIM-HC) and Simulated Annealing (HUIM-SA). Both algorithms transform the input database into a bitmap for efficient utility computation and for search space pruning. To improve population diversity, HUIs discovered by evolution are used as target values for the next population instead of keeping the current optimal values in the next population. Experiments on real-life datasets show that the proposed algorithms are faster than state-of-the-art heuristic and evolutionary HUIM algorithms, that HUIM-SA discovers similar HUIs, and that HUIM-SA evolves linearly with the number of iterations.
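
To make the ideas in the abstract concrete, the following is a minimal Python sketch of bitmap-based utility computation combined with a generic simulated annealing search. It is not the HUIM-SA algorithm of the article: the toy database, the one-item flip neighbor, the cooling schedule, and all names (`DB`, `build_bitmap`, `utility`, `neighbor`, `anneal`) are assumptions chosen for illustration only.

```python
import math
import random

# Hypothetical toy database: each transaction maps item -> utility
# (purchase quantity times unit profit, already multiplied out).
DB = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 2},
    {"a": 5, "b": 4, "d": 1},
]
ITEMS = sorted({item for trans in DB for item in trans})


def build_bitmap(db):
    """Map each item to an int whose bit k is set if transaction k contains it."""
    bitmap = {}
    for tid, trans in enumerate(db):
        for item in trans:
            bitmap[item] = bitmap.get(item, 0) | (1 << tid)
    return bitmap


BITMAP = build_bitmap(DB)


def utility(itemset, db=DB, bitmap=BITMAP):
    """Sum the itemset's utility over the transactions containing all its items."""
    if not itemset:
        return 0
    cover = (1 << len(db)) - 1
    for item in itemset:
        cover &= bitmap[item]  # bitwise AND keeps only the common transactions
    total, tid = 0, 0
    while cover:
        if cover & 1:
            total += sum(db[tid][item] for item in itemset)
        cover >>= 1
        tid += 1
    return total


def neighbor(itemset, rng):
    """Flip one randomly chosen item in or out of the itemset."""
    return itemset ^ {rng.choice(ITEMS)}


def anneal(min_util, iterations=200, temp=10.0, cooling=0.95, seed=0):
    """Generic simulated annealing that records every HUI it visits."""
    rng = random.Random(seed)
    current = frozenset([rng.choice(ITEMS)])
    found = set()
    for _ in range(iterations):
        candidate = neighbor(current, rng)
        delta = utility(candidate) - utility(current)
        # Accept improvements always; accept worse moves with prob. exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
        if candidate and utility(candidate) >= min_util:
            found.add(candidate)
        temp = max(temp * cooling, 1e-9)  # geometric cooling (an assumption)
    return found
```

In this sketch, a hill climbing variant would differ only in the acceptance test: it would move to `candidate` when `delta > 0` and never accept a worse itemset. Calling `anneal(16)` on the toy database can only return itemsets whose utility is at least 16; because the search is heuristic, it is not guaranteed to return all of them.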

Published In

ACM Transactions on Management Information Systems, Volume 13, Issue 1
March 2022
203 pages
ISSN:2158-656X
EISSN:2158-6578
DOI:10.1145/3483343

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 October 2021
Accepted: 01 April 2021
Revised: 01 March 2021
Received: 01 November 2020
Published in TMIS Volume 13, Issue 1

Author Tags

  1. Hill climbing
  2. simulated annealing
  3. high utility itemsets
  4. bitmap
  5. neighbor

Qualifiers

  • Research-article
  • Refereed
