Evaluation of Sensitive Data Hiding Techniques for Transaction Databases

Authors: Christos Makris, Panagiotis Markovits

SETN '18: Proceedings of the 10th Hellenic Conference on Artificial Intelligence

Article No.: 11, Pages 1 - 8

https://doi.org/10.1145/3200947.3201031

Published: 09 July 2018

Abstract

Storing statistical data in databases and exploiting it in real time through data mining techniques has become essential for modern organisations and businesses. Transaction databases are one of the domains where this process is frequently applied. By studying the transactions that occur in the course of their activity, businesses can draw valuable conclusions about the behavior of their customers and the patterns that dominate the market, and can therefore improve their business strategies or devise new ones. In fact, it is common practice for organisations to share these data with other parties for mutual benefit.

Such sharing is not risk-free, however: a business rival can use the shared data to lure away the organisation's customers, resulting in a loss of profit. The privacy of such sensitive data therefore has to be protected. One way to achieve this is to apply knowledge hiding techniques, many of which have been developed in recent years. The purpose of this paper is to introduce evaluation metrics for these data hiding algorithms in order to determine which are the most efficient, depending on the size and difficulty of the problem. The evaluation methods were implemented in the R programming language. Finally, we present the findings of the evaluation process and draw conclusions on the cases for which each metric is recommended.
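To make the notion of hiding sensitive knowledge in a transaction database more concrete, the following is a minimal sketch, assuming a toy database and a brute-force miner. It is not the paper's R implementation and does not reproduce the metrics the paper proposes, which the abstract does not specify. The function names (`support`, `frequent_itemsets`, `hiding_report`) and the three counts reported are illustrative assumptions, chosen to reflect the side effects commonly discussed for frequent itemset hiding.

```python
from itertools import combinations


def support(db, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in db) / len(db)


def frequent_itemsets(db, min_support):
    """Brute-force enumeration of frequent itemsets; suitable for toy data only."""
    universe = sorted(set().union(*db))
    found = set()
    for size in range(1, len(universe) + 1):
        for candidate in combinations(universe, size):
            if support(db, candidate) >= min_support:
                found.add(frozenset(candidate))
    return found


def hiding_report(original, sanitized, sensitive, min_support):
    """Compare the frequent itemsets before and after sanitization (hypothetical helper)."""
    before = frequent_itemsets(original, min_support)
    after = frequent_itemsets(sanitized, min_support)
    sensitive = {frozenset(s) for s in sensitive}
    # A frequent itemset must disappear if it contains a sensitive itemset.
    must_go = {f for f in before if any(s <= f for s in sensitive)}
    keep = before - must_go
    return {
        "hiding_failures": len(after & must_go),  # sensitive itemsets still frequent
        "lost_itemsets": len(keep - after),       # non-sensitive itemsets no longer frequent
        "ghost_itemsets": len(after - before),    # itemsets that became frequent
    }


original = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("bc")]
# Two candidate sanitizations that both hide {a, b} at min_support = 0.5:
sanitized_a = [frozenset("abc"), frozenset("a"), frozenset("ac"), frozenset("bc")]
sanitized_b = [frozenset("ac"), frozenset("ab"), frozenset("ac"), frozenset("bc")]

report_a = hiding_report(original, sanitized_a, [{"a", "b"}], 0.5)
report_b = hiding_report(original, sanitized_b, [{"a", "b"}], 0.5)
print(report_a)
print(report_b)
```

Both sanitized databases differ from the original by a single removed item, yet they can differ in how many non-sensitive itemsets they lose. Comparing hiding algorithms on counts of this kind is one plausible starting point for an evaluation.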

Index Terms

Evaluation of Sensitive Data Hiding Techniques for Transaction Databases
1. General and reference
  1. Cross-computing tools and techniques
    1. Evaluation
2. Security and privacy
  1. Security services
    1. Privacy-preserving protocols

Published In

SETN '18: Proceedings of the 10th Hellenic Conference on Artificial Intelligence

July 2018

339 pages

ISBN:9781450364331

DOI:10.1145/3200947

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

EETN: Hellenic Artificial Intelligence Society
UOP: University of Patras
University of Thessaly: University of Thessaly, Volos, Greece

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SETN '18

SETN '18: 10th Hellenic Conference on Artificial Intelligence

July 9 - 12, 2018

Patras, Greece
