DOI: 10.1145/3313831.3376768

Understanding Privacy-Related Questions on Stack Overflow

Published: 23 April 2020

Abstract

We analyse Stack Overflow (SO) to understand challenges and confusions developers face while dealing with privacy-related topics. We apply topic modelling techniques to 1,733 privacy-related questions to identify topics and then qualitatively analyse a random sample of 315 privacy-related questions. Identified topics include privacy policies, privacy concerns, access control, and version changes. Results show that developers do ask SO for support on privacy-related issues. We also find that platforms such as Apple and Google are defining privacy requirements for developers by specifying what "sensitive" information is and what types of information developers need to communicate to users (e.g. privacy policies). We also examine the accepted answers in our sample and find that 28% of them link to official documentation and more than half are answered by SO users without references to any external resources.

Supplementary Material

MP4 File (a639-tahaei-presentation.mp4)

Published In

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
April 2020
10688 pages
ISBN:9781450367080
DOI:10.1145/3313831
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. software developers
  2. stack overflow
  3. usable privacy

Qualifiers

  • Research-article

Conference

CHI '20

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%
