DOI: 10.1145/2517312.2517317
research-article
Open access

What you want is not what you get: predicting sharing policies for text-based content on Facebook

Published: 04 November 2013

Abstract

As the amount of content users publish on social networking sites rises, so do the danger and costs of inadvertently sharing content with an unintended audience. Studies repeatedly show that users frequently misconfigure their policies or misunderstand the privacy features offered by social networks.
A way to mitigate these problems is to develop automated tools to assist users in correctly setting their policy. This paper explores the viability of one such approach: we examine the extent to which machine learning can be used to deduce users' sharing preferences for content posted on Facebook. To generate data on which to evaluate our approach, we conduct an online survey of Facebook users, gathering their Facebook posts and associated policies, as well as their intended privacy policy for a subset of the posts. We use this data to test the efficacy of several algorithms at predicting policies, and the effects on prediction accuracy of varying the features on which they base their predictions. We find that Facebook's default behavior of assigning to a new post the privacy settings of the preceding one correctly assigns policies for only 67% of posts. The best of the prediction algorithms we tested outperforms this baseline for 80% of participants, with an average accuracy of 81%; this equates to a 45% reduction in the number of posts with misconfigured policies. Further, for those participants (66%) whose implemented policy usually matched their intended policy, our approach predicts the correct privacy settings for 94% of posts.
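The baseline described in the abstract (give each new post the privacy setting of the post before it) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' evaluation code: the function names, the policy labels, and the six example posts are all hypothetical, and the toy accuracy it prints has no relation to the 67% reported for the study data.

```python
from typing import List, Sequence


def previous_policy_baseline(policies: Sequence[str]) -> List[str]:
    """Predict each post's policy as the policy of the preceding post.

    The first post has no predecessor, so predictions cover posts 2..n only.
    """
    return list(policies[:-1])


def accuracy(predicted: Sequence[str], target: Sequence[str]) -> float:
    """Fraction of positions where the predicted policy equals the target."""
    if len(predicted) != len(target) or not target:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == t for p, t in zip(predicted, target))
    return correct / len(target)


# Hypothetical policies for six posts in chronological order (not study data).
posts = ["friends", "friends", "public", "friends", "friends", "only_me"]

predictions = previous_policy_baseline(posts)  # predictions for posts 2..6
targets = posts[1:]                            # policies of posts 2..6
print(accuracy(predictions, targets))
```

A learned predictor would replace `previous_policy_baseline` with a classifier over features of each post; the same `accuracy` helper could then compare the two on identical posts.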


    Published In

    AISec '13: Proceedings of the 2013 ACM workshop on Artificial intelligence and security
    November 2013
    116 pages
    ISBN: 9781450324885
    DOI: 10.1145/2517312
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. facebook
    2. machine learning
    3. natural language processing
    4. privacy
    5. social network

    Qualifiers

    • Research-article

    Conference

    CCS'13

    Acceptance Rates

    AISec '13 Paper Acceptance Rate 10 of 17 submissions, 59%;
    Overall Acceptance Rate 94 of 231 submissions, 41%

    Article Metrics

    • Downloads (last 12 months): 119
    • Downloads (last 6 weeks): 23
    Reflects downloads up to 27 Jan 2025
