DOI: 10.1145/3149572.3149586
research-article

A comparative study on innovative approaches for privacy-preservation in knowledge discovery

Published: 09 October 2017

Abstract

The growing volume of data, together with pressure to extract useful knowledge from it in many different ways, has made privacy preservation a crucial subject. The issue is even more important in big data environments that apply knowledge discovery and data mining to produce useful information. Besides the intrinsic importance of the privacy of personal data, the efficiency of privacy-preserving approaches is a key factor, because these methods introduce overheads that reduce the accuracy of data mining results. Soft computing is a general name for a group of logic-based methods with a variety of uses, the most recent of which is privacy preservation in big data. This paper presents a comprehensive survey of conventional privacy-preserving methods for knowledge discovery in databases (KDD) and data mining, and then discusses why soft computing methods can serve as a substitute for them in these environments. Alongside an analysis and discussion of the merits and shortcomings of these approaches, it presents a conceptual framework for the state of the art in privacy preservation and identifies research gaps and future work.

Published In

ICIME 2017: Proceedings of the 9th International Conference on Information Management and Engineering
October 2017
233 pages
ISBN: 9781450353373
DOI: 10.1145/3149572

In-Cooperation

University of Salford

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. Big data
2. Data Mining
3. KDD
4. PPDM
5. Privacy preservation
6. Soft Computing

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

ICIME 2017

Acceptance Rates

Overall Acceptance Rate: 19 of 31 submissions, 61%
