Showing 1–50 of 109 results for author: De Cristofaro, E

  1. arXiv:2406.13985

    cs.LG cs.CR

    The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging

    Authors: Georgi Ganev, Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro

    Abstract: Synthetic data created by differentially private (DP) generative models is increasingly used in real-world settings. In this context, PATE-GAN has emerged as a popular algorithm, combining Generative Adversarial Networks (GANs) with the private training approach of PATE (Private Aggregation of Teacher Ensembles). In this paper, we analyze and benchmark six open-source PATE-GAN implementations, inc…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2405.16682

    cs.LG cs.CL cs.CR

    A Systematic Review of Federated Generative Models

    Authors: Ashkan Vedadi Gargary, Emiliano De Cristofaro

    Abstract: Federated Learning (FL) has emerged as a solution for distributed systems that allow clients to train models on their data and only share models instead of local data. Generative Models are designed to learn the distribution of a dataset and generate new data samples that are similar to the original data. Many prior works have tried proposing Federated Generative Models. Using Federated Learning a…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 24 Pages, 3 Figures, 5 Tables

  3. arXiv:2405.14106

    cs.CR cs.LG

    Nearly Tight Black-Box Auditing of Differentially Private Machine Learning

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro

    Abstract: This paper presents a nearly tight audit of the Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm in the black-box model. Our auditing procedure empirically estimates the privacy leakage from DP-SGD using membership inference attacks; unlike prior work, the estimates are appreciably close to the theoretical DP bounds. The main intuition is to craft worst-case initial model para…

    Submitted 22 May, 2024; originally announced May 2024.

  4. arXiv:2405.10994

    cs.CR

    "What do you want from theory alone?" Experimenting with Tight Auditing of Differentially Private Synthetic Data Generation

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Georgi Ganev, Emiliano De Cristofaro

    Abstract: Differentially private synthetic data generation (DP-SDG) algorithms are used to release datasets that are structurally and statistically similar to sensitive data while providing formal bounds on the information they leak. However, bugs in algorithms and implementations may cause the actual information leakage to be higher. This prompts the need to verify whether the theoretical guarantees of sta…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: To appear at Usenix Security 2024

  5. arXiv:2405.10233

    cs.SI cs.CY cs.IR

    iDRAMA-Scored-2024: A Dataset of the Scored Social Media Platform from 2020 to 2023

    Authors: Jay Patel, Pujan Paudel, Emiliano De Cristofaro, Gianluca Stringhini, Jeremy Blackburn

    Abstract: Online web communities often face bans for violating platform policies, encouraging their migration to alternative platforms. This migration, however, can result in increased toxicity and unforeseen consequences on the new platform. In recent years, researchers have collected data from many alternative platforms, indicating coordinated efforts leading to offline events, conspiracy movements, hate…

    Submitted 16 May, 2024; originally announced May 2024.

  6. arXiv:2401.13248

    cs.CY cs.SI

    "Here's Your Evidence": False Consensus in Public Twitter Discussions of COVID-19 Science

    Authors: Alexandros Efstratiou, Marina Efstratiou, Satrio Yudhoatmojo, Jeremy Blackburn, Emiliano De Cristofaro

    Abstract: The COVID-19 pandemic brought about an extraordinary rate of scientific papers on the topic that were discussed among the general public, although often in biased or misinformed ways. In this paper, we present a mixed-methods analysis aimed at examining whether public discussions were commensurate with the scientific consensus on several COVID-19 issues. We estimate scientific consensus based on s…

    Submitted 7 June, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted for publication at 27th ACM Conference on Computer Supported Cooperative Work and Social Computing (ACM CSCW 2024). Please cite accordingly

  7. arXiv:2312.08394

    cs.CR cs.CY cs.SI

    From HODL to MOON: Understanding Community Evolution, Emotional Dynamics, and Price Interplay in the Cryptocurrency Ecosystem

    Authors: Kostantinos Papadamou, Jay Patel, Jeremy Blackburn, Philipp Jovanovic, Emiliano De Cristofaro

    Abstract: This paper presents a large-scale analysis of the cryptocurrency community on Reddit, shedding light on the intricate relationship between the evolution of their activity, emotional dynamics, and price movements. We analyze over 130M posts on 122 cryptocurrency-related subreddits using temporal analysis, statistical modeling, and emotion detection. While /r/CryptoCurrency and /r/dogecoin are the m…

    Submitted 12 December, 2023; originally announced December 2023.

  8. arXiv:2312.05114

    cs.CR cs.AI cs.LG

    On the Inadequacy of Similarity-based Privacy Metrics: Reconstruction Attacks against "Truly Anonymous Synthetic Data"

    Authors: Georgi Ganev, Emiliano De Cristofaro

    Abstract: Training generative models to produce synthetic data is meant to provide a privacy-friendly approach to data release. However, we get robust guarantees only when models are trained to satisfy Differential Privacy (DP). Alas, this is not the standard in industry as many companies use ad-hoc strategies to empirically evaluate privacy based on the statistical similarity between synthetic and real dat…

    Submitted 8 December, 2023; originally announced December 2023.

  9. arXiv:2311.16940

    cs.CR cs.CY

    FP-Fed: Privacy-Preserving Federated Detection of Browser Fingerprinting

    Authors: Meenatchi Sundaram Muthu Selva Annamalai, Igor Bilogrevic, Emiliano De Cristofaro

    Abstract: Browser fingerprinting often provides an attractive alternative to third-party cookies for tracking users across the web. In fact, the increasing restrictions on third-party cookies placed by common web browsers and recent regulations like the GDPR may accelerate the transition. To counter browser fingerprinting, previous work proposed several techniques to detect its prevalence and severity. Howe…

    Submitted 28 November, 2023; originally announced November 2023.

    Journal ref: Published in the Proceedings of the 31st Network and Distributed System Security Symposium (NDSS 2024), please cite accordingly

  10. arXiv:2308.05247

    cs.SI cs.CR

    TUBERAIDER: Attributing Coordinated Hate Attacks on YouTube Videos to their Source Communities

    Authors: Mohammad Hammas Saeed, Kostantinos Papadamou, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini

    Abstract: Alas, coordinated hate attacks, or raids, are becoming increasingly common online. In a nutshell, these are perpetrated by a group of aggressors who organize and coordinate operations on a platform (e.g., 4chan) to target victims on another community (e.g., YouTube). In this paper, we focus on attributing raids to their source community, paving the way for moderation approaches that take the conte…

    Submitted 22 June, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted for publication at the 18th International AAAI Conference on Web and Social Media (ICWSM 2024). Please cite accordingly

  11. arXiv:2305.10994

    cs.LG cs.CR

    Graphical vs. Deep Generative Models: Measuring the Impact of Differentially Private Mechanisms and Budgets on Utility

    Authors: Georgi Ganev, Kai Xu, Emiliano De Cristofaro

    Abstract: Generative models trained with Differential Privacy (DP) can produce synthetic data while reducing privacy risks. However, navigating their privacy-utility tradeoffs makes finding the best models for specific settings/tasks challenging. This paper bridges this gap by profiling how DP generative models for tabular data distribute privacy budgets across rows and columns, which is one of the primary…

    Submitted 28 August, 2024; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: A shorter version of this paper appears in the Proceedings of the 31st ACM Conference on Computer and Communications Security (ACM CCS 2024). This is the full version

  12. arXiv:2304.08847

    cs.LG cs.CR

    BadVFL: Backdoor Attacks in Vertical Federated Learning

    Authors: Mohammad Naseri, Yufei Han, Emiliano De Cristofaro

    Abstract: Federated learning (FL) enables multiple parties to collaboratively train a machine learning model without sharing their data; rather, they train their own model locally and send updates to a central server for aggregation. Depending on how the data is distributed among the participants, FL can be classified into Horizontal (HFL) and Vertical (VFL). In VFL, the participants share the same set of t…

    Submitted 23 August, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: Accepted for publication at the 45th IEEE Symposium on Security & Privacy (S&P 2024). Please cite accordingly

  13. arXiv:2303.07099

    cs.CY cs.SI

    Beyond Fish and Bicycles: Exploring the Varieties of Online Women's Ideological Spaces

    Authors: Utkucan Balci, Chen Ling, Emiliano De Cristofaro, Megan Squire, Gianluca Stringhini, Jeremy Blackburn

    Abstract: The Internet has been instrumental in connecting under-represented and vulnerable groups of people. Platforms built to foster social interaction and engagement have enabled historically disenfranchised groups to have a voice. One such vulnerable group is women. In this paper, we explore the diversity in online women's ideological spaces using a multi-dimensional approach. We perform a large-scale,…

    Submitted 13 March, 2023; originally announced March 2023.

    Journal ref: Published in the Proceedings of the 15th ACM Web Science Conference 2023 (ACM WebSci 2023). Please cite the WebSci version

  14. arXiv:2303.01230

    cs.CR cs.AI cs.CY

    Synthetic Data: Methods, Use Cases, and Risks

    Authors: Emiliano De Cristofaro

    Abstract: Sharing data can often enable compelling applications and analytics. However, more often than not, valuable datasets contain information of a sensitive nature, and thus, sharing them can endanger the privacy of users and organizations. A possible alternative gaining momentum in both the research community and industry is to share synthetic data instead. The idea is to release artificially generate…

    Submitted 27 February, 2024; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: To Appear in IEEE Security and Privacy Magazine

  15. arXiv:2212.05926

    cs.CR cs.CY cs.SI

    LAMBRETTA: Learning to Rank for Twitter Soft Moderation

    Authors: Pujan Paudel, Jeremy Blackburn, Emiliano De Cristofaro, Savvas Zannettou, Gianluca Stringhini

    Abstract: To curb the problem of false information, social media platforms like Twitter started adding warning labels to content discussing debunked narratives, with the goal of providing more context to their audiences. Unfortunately, these labels are not applied uniformly and leave large amounts of false content unmoderated. This paper presents LAMBRETTA, a system that automatically identifies tweets that…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: 44th IEEE Symposium on Security & Privacy (S&P 2023)

  16. arXiv:2211.14388

    cs.CY cs.SI

    Non-Polar Opposites: Analyzing the Relationship Between Echo Chambers and Hostile Intergroup Interactions on Reddit

    Authors: Alexandros Efstratiou, Jeremy Blackburn, Tristan Caulfield, Gianluca Stringhini, Savvas Zannettou, Emiliano De Cristofaro

    Abstract: Previous research has documented the existence of both online echo chambers and hostile intergroup interactions. In this paper, we explore the relationship between these two phenomena by studying the activity of 5.97M Reddit users and 421M comments posted over 13 years. We examine whether users who are more engaged in echo chambers are more hostile when they comment on other communities. We then c…

    Submitted 25 November, 2022; originally announced November 2022.

    Journal ref: 17th International AAAI Conference on Web and Social Media (ICWSM 2023). Please cite accordingly

  17. arXiv:2209.03463

    cs.CY cs.AI cs.CR cs.SI

    Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots

    Authors: Wai Man Si, Michael Backes, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Savvas Zannettou, Yang Zhang

    Abstract: Chatbots are used in many applications, e.g., automated agents, smart home assistants, interactive characters in online games, etc. Therefore, it is crucial to ensure they do not behave in undesired manners, providing offensive or toxic responses to users. This is not a trivial task as state-of-the-art chatbot models are trained on large, public datasets openly collected from the Internet. This pa…

    Submitted 9 September, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Journal ref: Published in ACM CCS 2022. Please cite the CCS version

  18. arXiv:2209.03050

    cs.CR cs.AI

    Cerberus: Exploring Federated Prediction of Security Events

    Authors: Mohammad Naseri, Yufei Han, Enrico Mariconti, Yun Shen, Gianluca Stringhini, Emiliano De Cristofaro

    Abstract: Modern defenses against cyberattacks increasingly rely on proactive approaches, e.g., to predict the adversary's next actions based on past events. Building accurate prediction models requires knowledge from many organizations; alas, this entails disclosing sensitive information, such as network structures, security postures, and policies, which might often be undesirable or outright impossible. I…

    Submitted 7 September, 2022; originally announced September 2022.

    Journal ref: Proceedings of the 29th ACM Conference on Computer and Communications Security (ACM CCS 2022)

  19. arXiv:2206.15237

    cs.CY cs.SI

    Adherence to Misinformation on Social Media Through Socio-Cognitive and Group-Based Processes

    Authors: Alexandros Efstratiou, Emiliano De Cristofaro

    Abstract: Previous work suggests that people's preference for different kinds of information depends on more than just accuracy. This could happen because the messages contained within different pieces of information may either be well-liked or repulsive. Whereas factual information must often convey uncomfortable truths, misinformation can have little regard for veracity and leverage psychological processe…

    Submitted 30 June, 2022; originally announced June 2022.

    Journal ref: 25th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW 2022)

  20. arXiv:2204.12709

    cs.CY cs.NI

    Toxicity in the Decentralized Web and the Potential for Model Sharing

    Authors: Haris Bin Zia, Aravindh Raman, Ignacio Castro, Ishaku Hassan Anaobi, Emiliano De Cristofaro, Nishanth Sastry, Gareth Tyson

    Abstract: The "Decentralised Web" (DW) is an evolving concept, which encompasses technologies aimed at providing greater transparency and openness on the web. The DW relies on independent servers (aka instances) that mesh together in a peer-to-peer fashion to deliver a range of services (e.g. micro-blogs, image sharing, video streaming). However, toxic content moderation in this decentralised context is cha…

    Submitted 27 April, 2022; originally announced April 2022.

    Journal ref: Published in the Proceedings of the 2022 ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS'22). Please cite accordingly

  21. arXiv:2202.08492

    cs.CY cs.CV

    Feels Bad Man: Dissecting Automated Hateful Meme Detection Through the Lens of Facebook's Challenge

    Authors: Catherine Jennifer, Fatemeh Tahmasbi, Jeremy Blackburn, Gianluca Stringhini, Savvas Zannettou, Emiliano De Cristofaro

    Abstract: Internet memes have become a dominant method of communication; at the same time, however, they are also increasingly being used to advocate extremism and foster derogatory beliefs. Nonetheless, we do not have a firm understanding as to which perceptual aspects of memes cause this phenomenon. In this work, we assess the efficacy of current state-of-the-art multimodal machine learning models toward…

    Submitted 17 February, 2022; originally announced February 2022.

  22. arXiv:2112.00443

    cs.CR cs.CY cs.SI

    TROLLMAGNIFIER: Detecting State-Sponsored Troll Accounts on Reddit

    Authors: Mohammad Hammas Saeed, Shiza Ali, Jeremy Blackburn, Emiliano De Cristofaro, Savvas Zannettou, Gianluca Stringhini

    Abstract: Growing evidence points to recurring influence campaigns on social media, often sponsored by state actors aiming to manipulate public opinion on sensitive political topics. Typically, campaigns are performed through instrumented accounts, known as troll accounts; despite their prominence, however, little work has been done to detect these accounts in the wild. In this paper, we present TROLLMAGNIF…

    Submitted 1 December, 2021; originally announced December 2021.

  23. arXiv:2111.02455

    cs.DL cs.SI

    Understanding the Use of e-Prints on Reddit and 4chan's Politically Incorrect Board

    Authors: Satrio Baskoro Yudhoatmojo, Emiliano De Cristofaro, Jeremy Blackburn

    Abstract: The dissemination and reach of scientific knowledge have increased at a blistering pace. In this context, e-Print servers have played a central role by providing scientists with a rapid and open mechanism for disseminating research without waiting for the (lengthy) peer review process. While helping the scientific community in several ways, e-Print servers also provide scientific communicators and…

    Submitted 8 March, 2023; v1 submitted 3 November, 2021; originally announced November 2021.

    Journal ref: Published in the Proceedings of the 15th ACM Web Science Conference 2023 (ACM WebSci 2023). Please cite the WebSci version

  24. arXiv:2111.02452

    cs.CY cs.CV

    Slapping Cats, Bopping Heads, and Oreo Shakes: Understanding Indicators of Virality in TikTok Short Videos

    Authors: Chen Ling, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini

    Abstract: Short videos have become one of the leading media used by younger generations to express themselves online and thus a driving force in shaping online culture. In this context, TikTok has emerged as a platform where viral videos are often posted first. In this paper, we study what elements of short videos posted on TikTok contribute to their virality. We apply a mixed-method approach to develop a c…

    Submitted 3 November, 2021; originally announced November 2021.

  25. arXiv:2111.02187

    cs.SI cs.CY

    Soros, Child Sacrifices, and 5G: Understanding the Spread of Conspiracy Theories on Web Communities

    Authors: Pujan Paudel, Jeremy Blackburn, Emiliano De Cristofaro, Savvas Zannettou, Gianluca Stringhini

    Abstract: This paper presents a multi-platform computational pipeline geared to identify social media posts discussing (known) conspiracy theories. We use 189 conspiracy claims collected by Snopes, and find 66k posts and 277k comments on Reddit, and 379k tweets discussing them. Then, we study how conspiracies are discussed on different Web communities and which ones are particularly influential in driving t…

    Submitted 3 November, 2021; originally announced November 2021.

  26. arXiv:2110.13500

    cs.CY

    Exploring Content Moderation in the Decentralised Web: The Pleroma Case

    Authors: Anaobi Ishaku Hassan, Aravindh Raman, Ignacio Castro, Haris Bin Zia, Emiliano De Cristofaro, Nishanth Sastry, Gareth Tyson

    Abstract: Decentralising the Web is a desirable but challenging goal. One particular challenge is achieving decentralised content moderation in the face of various adversaries (e.g. trolls). To overcome this challenge, many Decentralised Web (DW) implementations rely on federation policies. Administrators use these policies to create rules that ban or modify content that matches specific rules. This, howeve…

    Submitted 30 October, 2021; v1 submitted 26 October, 2021; originally announced October 2021.

    Journal ref: Proceedings of the 17th International Conference on emerging Networking EXperiments and Technologies (ACM CoNext 2021)

  27. arXiv:2109.11429

    cs.LG cs.AI cs.CR cs.CY

    Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data

    Authors: Georgi Ganev, Bristena Oprisanu, Emiliano De Cristofaro

    Abstract: Generative models trained with Differential Privacy (DP) can be used to generate synthetic data while minimizing privacy risks. We analyze the impact of DP on these models vis-a-vis underrepresented classes/subgroups of data, specifically, studying: 1) the size of classes/subgroups in the synthetic data, and 2) the accuracy of classification tasks run on them. We also evaluate the effect of variou…

    Submitted 26 June, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

    Journal ref: Proceedings of the 39th International Conference on Machine Learning (ICML 2022)

  28. arXiv:2108.05876

    cs.CY cs.SI

    An Early Look at the Gettr Social Network

    Authors: Pujan Paudel, Jeremy Blackburn, Emiliano De Cristofaro, Savvas Zannettou, Gianluca Stringhini

    Abstract: This paper presents the first data-driven analysis of Gettr, a new social network platform launched by former US President Donald Trump's team. Among other things, we find that users on the platform heavily discuss politics, with a focus on the Trump campaign in the US and Bolsonaro's in Brazil. Activity on the platform has steadily been decreasing since its launch, although a core of verified use…

    Submitted 12 August, 2021; originally announced August 2021.

  29. arXiv:2104.11145

    cs.CY

    "I'm a Professor, which isn't usually a dangerous job": Internet-Facilitated Harassment and its Impact on Researchers

    Authors: Periwinkle Doerfler, Andrea Forte, Emiliano De Cristofaro, Gianluca Stringhini, Jeremy Blackburn, Damon McCoy

    Abstract: While the Internet has dramatically increased the exposure that research can receive, it has also facilitated harassment against scholars. To understand the impact that these attacks can have on the work of researchers, we perform a series of systematic interviews with researchers including academics, journalists, and activists, who have experienced targeted, Internet-facilitated harassment. We pr…

    Submitted 22 April, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

  30. arXiv:2103.03631

    cs.CY cs.SI

    A Multi-Platform Analysis of Political News Discussion and Sharing on Web Communities

    Authors: Yuping Wang, Savvas Zannettou, Jeremy Blackburn, Barry Bradlyn, Emiliano De Cristofaro, Gianluca Stringhini

    Abstract: The news ecosystem has become increasingly complex, encompassing a wide range of sources with varying levels of trustworthiness, and with public commentary giving different spins to the same stories. In this paper, we present a multi-platform measurement of this ecosystem. We compile a list of 1,073 news websites and extract posts from four Web communities (Twitter, Reddit, 4chan, and Gab) that co…

    Submitted 5 March, 2021; originally announced March 2021.

  31. arXiv:2102.03314

    q-bio.GN cs.AI cs.CR

    On Utility and Privacy in Synthetic Genomic Data

    Authors: Bristena Oprisanu, Georgi Ganev, Emiliano De Cristofaro

    Abstract: The availability of genomic data is essential to progress in biomedical research, personalized medicine, etc. However, its extreme sensitivity makes it problematic, if not outright impossible, to publish or share it. As a result, several initiatives have been launched to experiment with synthetic genomic data, e.g., using generative models to learn the underlying distribution of the real data and…

    Submitted 18 January, 2022; v1 submitted 5 February, 2021; originally announced February 2021.

    Comments: Published in the Proceedings of the 29th Network and Distributed System Security Symposium (NDSS 2022)

  32. arXiv:2102.02551

    cs.CR cs.AI cs.LG stat.ML

    ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models

    Authors: Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, Michael Backes, Emiliano De Cristofaro, Mario Fritz, Yang Zhang

    Abstract: Inference attacks against Machine Learning (ML) models allow adversaries to learn sensitive information about training data, model parameters, etc. While researchers have studied, in depth, several kinds of attacks, they have done so in isolation. As a result, we lack a comprehensive picture of the risks caused by the attacks, e.g., the different scenarios they can be applied to, the common factor…

    Submitted 6 October, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

  33. arXiv:2101.08750

    cs.CY cs.SI

    The Gospel According to Q: Understanding the QAnon Conspiracy from the Perspective of Canonical Information

    Authors: Antonis Papasavva, Max Aliapoulios, Cameron Ballard, Emiliano De Cristofaro, Gianluca Stringhini, Savvas Zannettou, Jeremy Blackburn

    Abstract: The QAnon conspiracy theory claims that a cabal of (literally) blood-thirsty politicians and media personalities are engaged in a war to destroy society. By interpreting cryptic "drops" of information from an anonymous insider calling themself Q, adherents of the conspiracy theory believe that Donald Trump is leading them in an active fight against this cabal. QAnon has been covered extensively by…

    Submitted 29 April, 2022; v1 submitted 21 January, 2021; originally announced January 2021.

    Journal ref: Published in the Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM 2022). Please cite accordingly

  34. arXiv:2101.06535

    cs.HC cs.CY cs.SI

    Dissecting the Meme Magic: Understanding Indicators of Virality in Image Memes

    Authors: Chen Ling, Ihab AbuHilal, Jeremy Blackburn, Emiliano De Cristofaro, Savvas Zannettou, Gianluca Stringhini

    Abstract: Despite the increasingly important role played by image memes, we do not yet have a solid understanding of the elements that might make a meme go viral on social media. In this paper, we investigate what visual elements distinguish image memes that are highly viral on social media from those that do not get re-shared, across three dimensions: composition, subjects, and target audience. Drawing fro…

    Submitted 16 January, 2021; originally announced January 2021.

    Comments: To appear at the 24th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2021)

  35. arXiv:2101.03820

    cs.SI cs.CY physics.soc-ph

    An Early Look at the Parler Online Social Network

    Authors: Max Aliapoulios, Emmi Bevensee, Jeremy Blackburn, Barry Bradlyn, Emiliano De Cristofaro, Gianluca Stringhini, Savvas Zannettou

    Abstract: Parler is an "alternative" social network promoting itself as a service that allows to "speak freely and express yourself openly, without fear of being deplatformed for your views." Because of this promise, the platform became popular among users who were suspended on mainstream social networks for violating their terms of service, as well as those fearing censorship. In particular, the service…

    Submitted 18 February, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

    Journal ref: Proceedings of the International AAAI Conference on Web and Social Media, 15(1), 943--951 (2021)

  36. arXiv:2010.11638

    cs.CY cs.SI

    "It is just a flu": Assessing the Effect of Watch History on YouTube's Pseudoscientific Video Recommendations

    Authors: Kostantinos Papadamou, Savvas Zannettou, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Michael Sirivianos

    Abstract: The role played by YouTube's recommendation algorithm in unwittingly promoting misinformation and conspiracy theories is not entirely understood. Yet, this can have dire real-world consequences, especially when pseudoscientific content is promoted to users at critical times, such as the COVID-19 pandemic. In this paper, we set out to characterize and detect pseudoscientific misinformation on YouTu…

    Submitted 12 October, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: To appear at the 16th International Conference on Web and Social Media (ICWSM 2022). Please cite the ICWSM version

  37. Do Platform Migrations Compromise Content Moderation? Evidence from r/The_Donald and r/Incels

    Authors: Manoel Horta Ribeiro, Shagun Jhaver, Savvas Zannettou, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Robert West

    Abstract: When toxic online communities on mainstream platforms face moderation measures, such as bans, they may migrate to other platforms with laxer policies or set up their own dedicated websites. Previous work suggests that within mainstream platforms, community-level moderation is effective in mitigating the harm caused by the moderated communities. It is, however, unclear whether these results also ho…

    Submitted 20 August, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: This paper has been accepted at CSCW 2021, please cite accordingly

  38. arXiv:2009.11792

    cs.CY

    Understanding the Use of Fauxtography on Social Media

    Authors: Yuping Wang, Fatemeh Tahmasbi, Jeremy Blackburn, Barry Bradlyn, Emiliano De Cristofaro, David Magerman, Savvas Zannettou, Gianluca Stringhini

    Abstract: Despite the influence that image-based communication has on online discourse, the role played by images in disinformation is still not well understood. In this paper, we present the first large-scale study of fauxtography, analyzing the use of manipulated or misleading images in news discussion on online communities. First, we develop a computational pipeline geared to detect fauxtography, and ide…

    Submitted 25 September, 2020; v1 submitted 24 September, 2020; originally announced September 2020.

  39. arXiv:2009.04885

    cs.CY

    "Is it a Qoincidence?": An Exploratory Study of QAnon on Voat

    Authors: Antonis Papasavva, Jeremy Blackburn, Gianluca Stringhini, Savvas Zannettou, Emiliano De Cristofaro

    Abstract: Online fringe communities offer fertile grounds for users seeking and sharing ideas fueling suspicion of mainstream news and conspiracy theories. Among these, the QAnon conspiracy theory emerged in 2017 on 4chan, broadly supporting the idea that powerful politicians, aristocrats, and celebrities are closely engaged in a global pedophile ring. Simultaneously, governments are thought to be controlle…

    Submitted 14 February, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

    Journal ref: Published in the Proceedings of 30th The Web Conference (WWW 2021). Please cite the WWW version

  40. arXiv:2009.03561  [pdf, other]

    cs.CR cs.AI

    Local and Central Differential Privacy for Robustness and Privacy in Federated Learning

    Authors: Mohammad Naseri, Jamie Hayes, Emiliano De Cristofaro

    Abstract: Federated Learning (FL) allows multiple participants to train machine learning models collaboratively by keeping their datasets local while only exchanging model updates. Alas, this is not necessarily free from privacy and robustness vulnerabilities, e.g., via membership, property, and backdoor attacks. This paper investigates whether and to what extent one can use Differential Privacy (DP) to pro…

    Submitted 27 May, 2022; v1 submitted 8 September, 2020; originally announced September 2020.

    Journal ref: Published in the Proceedings of the 29th Network and Distributed System Security Symposium (NDSS 2022)

  41. arXiv:2005.08679  [pdf, other]

    cs.LG cs.AI cs.CR cs.CY stat.ML

    An Overview of Privacy in Machine Learning

    Authors: Emiliano De Cristofaro

    Abstract: Over the past few years, providers such as Google, Microsoft, and Amazon have started to provide customers with access to software interfaces allowing them to easily embed machine learning tasks into their applications. Overall, organizations can now use Machine Learning as a Service (MLaaS) engines to outsource complex tasks, e.g., training classifiers, performing predictions, clustering, etc. Th…

    Submitted 18 May, 2020; originally announced May 2020.

  42. "How over is it?" Understanding the Incel Community on YouTube

    Authors: Kostantinos Papadamou, Savvas Zannettou, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Michael Sirivianos

    Abstract: YouTube is by far the largest host of user-generated video content worldwide. Alas, the platform has also come under fire for hosting inappropriate, toxic, and hateful content. One community that has often been linked to sharing and publishing hateful and misogynistic content is the Involuntary Celibates (Incels), a loosely defined movement ostensibly focusing on men's issues. In this paper, we s…

    Submitted 23 August, 2021; v1 submitted 22 January, 2020; originally announced January 2020.

    Comments: To appear at the 24th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2021). Please cite the CSCW version

  43. arXiv:2001.07600  [pdf, other]

    cs.CY

    The Evolution of the Manosphere Across the Web

    Authors: Manoel Horta Ribeiro, Jeremy Blackburn, Barry Bradlyn, Emiliano De Cristofaro, Gianluca Stringhini, Summer Long, Stephanie Greenberg, Savvas Zannettou

    Abstract: In this paper, we present a large-scale characterization of the Manosphere, a conglomerate of Web-based misogynist movements roughly focused on "men's issues," which has seen significant growth over the past years. We do so by gathering and analyzing 28.8M posts from 6 forums and 51 subreddits. Overall, we paint a comprehensive picture of the evolution of the Manosphere on the Web, showing the lin…

    Submitted 8 April, 2021; v1 submitted 21 January, 2020; originally announced January 2020.

    Comments: To appear at the 15th International AAAI Conference on Web and Social Media (ICWSM 2021) -- please cite accordingly

  44. arXiv:2001.07487  [pdf, other]

    cs.CY cs.SI

    Raiders of the Lost Kek: 3.5 Years of Augmented 4chan Posts from the Politically Incorrect Board

    Authors: Antonis Papasavva, Savvas Zannettou, Emiliano De Cristofaro, Gianluca Stringhini, Jeremy Blackburn

    Abstract: This paper presents a dataset with over 3.3M threads and 134.5M posts from the Politically Incorrect board (/pol/) of the imageboard forum 4chan, posted over a period of almost 3.5 years (June 2016-November 2019). To the best of our knowledge, this represents the largest publicly available 4chan dataset, providing the community with an archive of posts that have been permanently deleted from 4chan…

    Submitted 1 April, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

    Journal ref: Published at the 14th International AAAI Conference on Web and Social Media (ICWSM 2020). Please cite the ICWSM version

  45. arXiv:2001.07157  [pdf, other]

    cs.CR

    On the Feasibility of Acoustic Attacks Using Commodity Smart Devices

    Authors: Matt Wixey, Shane Johnson, Emiliano De Cristofaro

    Abstract: Sound at frequencies above (ultrasonic) or below (infrasonic) the range of human hearing can, in some settings, cause adverse physiological and psychological effects to individuals. In this paper, we investigate the feasibility of cyber-attacks that could make smart consumer devices produce possibly imperceptible sound at both high (17-21kHz) and low (60-100Hz) frequencies, at the maximum availabl…

    Submitted 20 January, 2020; originally announced January 2020.

  46. arXiv:1909.05801  [pdf, other]

    cs.NI cs.CR cs.CY

    Challenges in the Decentralised Web: The Mastodon Case

    Authors: Aravindh Raman, Sagar Joglekar, Emiliano De Cristofaro, Nishanth Sastry, Gareth Tyson

    Abstract: The Decentralised Web (DW) has recently seen a renewed momentum, with a number of DW platforms like Mastodon, Peer-Tube, and Hubzilla gaining increasing traction. These offer alternatives to traditional social networks like Twitter, YouTube, and Facebook, by enabling the operation of web infrastructure and services without centralised ownership or control. Although their services differ greatly, m…

    Submitted 12 September, 2019; originally announced September 2019.

    Journal ref: Proceedings of 19th ACM Internet Measurement Conference (IMC 2019)

  47. arXiv:1908.11315  [pdf, other]

    cs.CR

    How Much Does GenoGuard Really "Guard"? An Empirical Analysis of Long-Term Security for Genomic Data

    Authors: Bristena Oprisanu, Christophe Dessimoz, Emiliano De Cristofaro

    Abstract: Due to its hereditary nature, genomic data is not only linked to its owner but to that of close relatives as well. As a result, its sensitivity does not really degrade over time; in fact, the relevance of a genomic sequence is likely to be longer than the security provided by encryption. This prompts the need for specialized techniques providing long-term security for genomic data, yet the only av…

    Submitted 29 August, 2019; originally announced August 2019.

    Journal ref: Proceedings of the 18th ACM CCS Workshop on Privacy in the Electronic Society (WPES 2019)

  48. arXiv:1907.08873  [pdf, other]

    cs.SI cs.CY cs.IR

    Detecting Cyberbullying and Cyberaggression in Social Media

    Authors: Despoina Chatzakou, Ilias Leontiadis, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Athena Vakali, Nicolas Kourtellis

    Abstract: Cyberbullying and cyberaggression are increasingly worrisome phenomena affecting people across all demographics. More than half of young social media users worldwide have been exposed to such prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotions, with negative consequences such as embarrassment, depression, isolation from other community members, which em…

    Submitted 20 July, 2019; originally announced July 2019.

    Comments: To appear in ACM Transactions on the Web (TWEB)

  49. arXiv:1902.07456  [pdf, other]

    cs.CR

    Measuring Membership Privacy on Aggregate Location Time-Series

    Authors: Apostolos Pyrgelis, Carmela Troncoso, Emiliano De Cristofaro

    Abstract: While location data is extremely valuable for various applications, disclosing it prompts serious threats to individuals' privacy. To limit such concerns, organizations often provide analysts with aggregate time-series that indicate, e.g., how many people are in a location at a time interval, rather than raw individual traces. In this paper, we perform a measurement study to understand Membership…

    Submitted 27 April, 2020; v1 submitted 20 February, 2019; originally announced February 2019.

    Journal ref: Presented at ACM SIGMETRICS 2020 and published in the Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Vol. 2, No. 4, Article 36, June 2020

  50. arXiv:1901.09735  [pdf, other]

    cs.CY

    "And We Will Fight For Our Race!" A Measurement Study of Genetic Testing Conversations on Reddit and 4chan

    Authors: Alexandros Mittos, Savvas Zannettou, Jeremy Blackburn, Emiliano De Cristofaro

    Abstract: Progress in genomics has enabled the emergence of a booming market for "direct-to-consumer" genetic testing. Nowadays, companies like 23andMe and AncestryDNA provide affordable health, genealogy, and ancestry reports, and have already tested tens of millions of customers. At the same time, alt- and far-right groups have also taken an interest in genetic testing, using them to attack minorities and…

    Submitted 4 October, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

    Comments: This is the full version of the paper, with same title, appearing in the 14th AAAI Conference on Web and Social Media (ICWSM 2020). Please cite the ICWSM version