DOI: 10.1145/3200947.3201031
research-article

Evaluation of Sensitive Data Hiding Techniques for Transaction Databases

Published: 09 July 2018

Abstract

Storing statistical data in databases and exploiting it in real time through data mining techniques has become essential for modern organisations and businesses. Transaction databases are one of the domains where this practice is frequently applied. By studying the transactions that occur in the course of their activity, businesses can draw valuable conclusions about the behavior of their customers and the patterns that dominate the market, and as a result improve their business strategies or devise new ones. In fact, it is common practice for organisations to share these data with other parties for their mutual benefit.
Such sharing, however, is not risk-free: a business rival can use the shared data to lure away the organisation's customers, resulting in a loss of profit. It is therefore apparent that the privacy of such sensitive data has to be protected. One way to achieve this is to apply knowledge hiding techniques, many of which have been developed in recent years. The purpose of this paper is to introduce evaluation metrics for these data hiding algorithms in order to determine which are the most efficient, depending on the size and difficulty of the problem. The evaluation methods were implemented in the programming language R. Finally, we present the findings of the evaluation process and reach a conclusion on the cases for which each metric is recommended.

Published In

SETN '18: Proceedings of the 10th Hellenic Conference on Artificial Intelligence
July 2018
339 pages
ISBN:9781450364331
DOI:10.1145/3200947

In-Cooperation

  • EETN: Hellenic Artificial Intelligence Society
  • UOP: University of Patras
  • University of Thessaly: University of Thessaly, Volos, Greece

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Evaluation Measures
  2. Privacy Preserving Data Mining
  3. Sensitive Data
  4. Transaction Databases

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SETN '18
