An Automated Social Graph De-anonymization Technique

Authors:

George Danezis

WPES '14: Proceedings of the 13th Workshop on Privacy in the Electronic Society

Pages 47 - 58

https://doi.org/10.1145/2665943.2665960

Published: 03 November 2014

Abstract

We present a generic and automated approach to re-identifying nodes in anonymized social networks, which enables novel anonymization techniques to be quickly evaluated. It uses machine learning (decision forests) to match pairs of nodes in disparate anonymized sub-graphs. The technique uncovers artefacts and invariants of any black-box anonymization scheme from a small set of examples. Despite a high degree of automation, classification succeeds with significant true positive rates even when small false positive rates are sought. Our evaluation uses publicly available real-world datasets to study the performance of our approach against real-world anonymization strategies, namely the schemes used to protect the datasets of the Data for Development (D4D) Challenge. We show that the technique is effective even when only small numbers of samples are used for training. Further, since it detects weaknesses in the black-box anonymization scheme, it can re-identify nodes in one social network when trained on another.
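As a rough illustration of the pair-matching setup the abstract describes, the sketch below builds labelled feature vectors for pairs of nodes drawn from two anonymized views of the same graph. Everything in it is an assumption made for illustration and is not taken from the paper: the random edge-dropping function `drop_edges` stands in for a black-box anonymization scheme, the neighbour-degree histogram is a hypothetical feature, and the names `pair_features` and `labelled_pairs` are invented here. The resulting `X` and `y` could then be handed to any decision forest implementation.

```python
import random

def neighbour_degree_histogram(adj, node, bins=(1, 2, 4, 8)):
    """Hypothetical feature: count the node's neighbours per degree bucket.

    `bins` holds inclusive upper bounds; a final bucket catches larger degrees.
    """
    hist = [0] * (len(bins) + 1)
    for nb in adj[node]:
        deg = len(adj[nb])
        hist[sum(deg > b for b in bins)] += 1
    return hist

def pair_features(adj_a, u, adj_b, v):
    """Feature vector for the candidate pair (u in view A, v in view B)."""
    return neighbour_degree_histogram(adj_a, u) + neighbour_degree_histogram(adj_b, v)

def drop_edges(adj, keep_prob, rng):
    """Toy stand-in for a black-box anonymization scheme (an assumption):
    keep each undirected edge independently with probability keep_prob."""
    out = {n: set() for n in adj}
    for u in sorted(adj):
        for v in sorted(adj[u]):
            if u < v and rng.random() < keep_prob:
                out[u].add(v)
                out[v].add(u)
    return out

def labelled_pairs(adj_a, adj_b, n_negative, rng):
    """Label 1 for pairs that are the same underlying node, 0 otherwise."""
    nodes = sorted(adj_a)
    X, y = [], []
    for n in nodes:
        X.append(pair_features(adj_a, n, adj_b, n))
        y.append(1)
    for _ in range(n_negative):
        u, v = rng.sample(nodes, 2)  # two distinct nodes, so a true non-match
        X.append(pair_features(adj_a, u, adj_b, v))
        y.append(0)
    return X, y

rng = random.Random(0)

# Small random graph used only as example data.
graph = {n: set() for n in range(30)}
for u in range(30):
    for v in range(u + 1, 30):
        if rng.random() < 0.2:
            graph[u].add(v)
            graph[v].add(u)

view_a = drop_edges(graph, 0.9, rng)
view_b = drop_edges(graph, 0.9, rng)
X, y = labelled_pairs(view_a, view_b, n_negative=60, rng=rng)
print(len(X), "labelled pairs,", sum(y), "of them true matches")
```

Because the classifier only ever sees features of node pairs, nothing in this setup depends on node identifiers, which is consistent with the abstract's point that a model trained on one network may be applied to another.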

Published In

WPES '14: Proceedings of the 13th Workshop on Privacy in the Electronic Society

November 2014, 218 pages

ISBN: 9781450331487

DOI: 10.1145/2665943

General Chair: Gail-Joon Ahn (Arizona State University, USA)

Program Chair: Anupam Datta (Carnegie Mellon University, USA)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CCS'14

Sponsor:

SIGSAC

CCS'14: 2014 ACM SIGSAC Conference on Computer and Communications Security

November 3, 2014

Scottsdale, Arizona, USA
