Showing 1–50 of 77 results for author: Hayes, J

  1. arXiv:2408.07009  [pdf, other]

    cs.CV

    Imagen 3

    Authors: Imagen-Team-Google: Jason Baldridge, Jakob Bauer, Mukul Bhutani, Nicole Brichtova, Andrew Bunner, Kelvin Chan, Yichang Chen, Sander Dieleman, Yuqing Du, Zach Eaton-Rosen, Hongliang Fei, Nando de Freitas, Yilin Gao, Evgeny Gladchenko, Sergio Gómez Colmenarejo, Mandy Guo, Alex Haig, Will Hawkins, Hexiang Hu, Huilian Huang, Tobenna Peter Igwe, Christos Kaplanis, Siavash Khodadadeh, et al. (227 additional authors not shown)

    Abstract: We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.

    Submitted 13 August, 2024; originally announced August 2024.

  2. arXiv:2407.20559  [pdf, ps, other]

    cs.LO

    Practical Rely/Guarantee Verification of an Efficient Lock for seL4 on Multicore Architectures

    Authors: Robert J. Colvin, Ian J. Hayes, Scott Heiner, Peter Höfner, Larissa Meinicke, Roger C. Su

    Abstract: Developers of low-level systems code providing core functionality for operating systems and kernels must address hardware-level features of modern multicore architectures. A particular feature is pipelined "out-of-order execution" of the code as written, the effects of which are typically summarised as a "weak memory model" - a term which includes further complicating factors that may be introduce…

    Submitted 30 July, 2024; originally announced July 2024.

  3. arXiv:2407.00106  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI

    Authors: Ilia Shumailov, Jamie Hayes, Eleni Triantafillou, Guillermo Ortiz-Jimenez, Nicolas Papernot, Matthew Jagielski, Itay Yona, Heidi Howard, Eugene Bagdasaryan

    Abstract: Exact unlearning was first introduced as a privacy mechanism that allowed a user to retract their data from machine learning models on request. Shortly after, inexact schemes were proposed to mitigate the impractical costs associated with exact unlearning. More recently unlearning is often discussed as an approach for removal of impermissible knowledge i.e. knowledge that the model should not poss…

    Submitted 27 June, 2024; originally announced July 2024.

  4. arXiv:2406.11715  [pdf, other]

    cs.LG cs.CL cs.SE

    Measuring memorization in RLHF for code completion

    Authors: Aneesh Pappu, Billy Porter, Ilia Shumailov, Jamie Hayes

    Abstract: Reinforcement learning with human feedback (RLHF) has become the dominant method to align large models to user preferences. Unlike fine-tuning, for which there are many studies regarding training data memorization, it is not clear how memorization is affected by or introduced in the RLHF alignment process. Understanding this relationship is important as real user data may be collected and used to…

    Submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2406.10011  [pdf, other]

    cs.LG cs.AI cs.CR

    Beyond Slow Signs in High-fidelity Model Extraction

    Authors: Hanna Foerster, Robert Mullins, Ilia Shumailov, Jamie Hayes

    Abstract: Deep neural networks, costly to train and rich in intellectual property value, are increasingly threatened by model extraction attacks that compromise their confidentiality. Previous attacks have succeeded in reverse-engineering model parameters up to a precision of float64 for models trained on random data with at most three hidden layers using cryptanalytical techniques. However, the process was…

    Submitted 14 June, 2024; originally announced June 2024.

  6. arXiv:2406.09073  [pdf, other]

    cs.LG

    Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition

    Authors: Eleni Triantafillou, Peter Kairouz, Fabian Pedregosa, Jamie Hayes, Meghdad Kurmanji, Kairan Zhao, Vincent Dumoulin, Julio Jacques Junior, Ioannis Mitliagkas, Jun Wan, Lisheng Sun Hosoya, Sergio Escalera, Gintare Karolina Dziugaite, Peter Triantafillou, Isabelle Guyon

    Abstract: We present the findings of the first NeurIPS competition on unlearning, which sought to stimulate the development of novel algorithms and initiate discussions on formal and robust evaluation methodologies. The competition was highly successful: nearly 1,200 teams from across the world participated, and a wealth of novel, imaginative solutions with different characteristics were contributed. In thi…

    Submitted 13 June, 2024; originally announced June 2024.

  7. arXiv:2406.08918  [pdf, other]

    cs.CR cs.AI cs.LG math.ST stat.ML

    Beyond the Calibration Point: Mechanism Comparison in Differential Privacy

    Authors: Georgios Kaissis, Stefan Kolek, Borja Balle, Jamie Hayes, Daniel Rueckert

    Abstract: In differentially private (DP) machine learning, the privacy guarantees of DP mechanisms are often reported and compared on the basis of a single $(\varepsilon, \delta)$-pair. This practice overlooks that DP guarantees can vary substantially even between mechanisms sharing a given $(\varepsilon, \delta)$, and potentially introduces privacy vulnerabilities which can remain undetected. This motivates the need…

    Submitted 10 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  8. arXiv:2405.20990  [pdf, other]

    cs.CR cs.AI cs.LG

    Locking Machine Learning Models into Hardware

    Authors: Eleanor Clifford, Adhithya Saravanan, Harry Langford, Cheng Zhang, Yiren Zhao, Robert Mullins, Ilia Shumailov, Jamie Hayes

    Abstract: Modern Machine Learning models are expensive IP and business competitiveness often depends on keeping this IP confidential. This in turn restricts how these models are deployed -- for example it is unclear how to deploy a model on-device without inevitably leaking the underlying model. At the same time, confidential computing technologies such as Multi-Party Computation or Homomorphic encryption r…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 10 pages, 2 figures of main text; 14 pages, 16 figures of appendices

  9. arXiv:2405.05690  [pdf, other]

    cs.LO

    Restructuring a concurrent refinement algebra

    Authors: Ian J. Hayes, Larissa A. Meinicke, Naso Evangelou-Oost

    Abstract: The concurrent refinement algebra has been developed to support rely/guarantee reasoning about concurrent programs. The algebra supports atomic commands and defines parallel composition as a synchronous operation, as in Milner's SCCS. In order to allow specifications to be combined, the algebra also provides a weak conjunction operation, which is also a synchronous operation that shares many prope…

    Submitted 9 May, 2024; originally announced May 2024.

    ACM Class: F.3.1; D.1.3

  10. arXiv:2405.05546  [pdf, other]

    cs.LO cs.SE

    Data reification in a concurrent rely-guarantee algebra

    Authors: Larissa A. Meinicke, Ian J. Hayes, Cliff B. Jones

    Abstract: Specifications of significant systems can be made short and perspicuous by using abstract data types; data reification can provide a clear, stepwise, development history of programs that use more efficient concrete representations. Data reification (or "refinement") techniques for sequential programs are well established. This paper applies these ideas to concurrency, in particular, an algebraic t…

    Submitted 9 May, 2024; originally announced May 2024.

    ACM Class: F.3.1; D.1.3

  11. arXiv:2403.13425  [pdf, other]

    cs.LO

    Reasoning about distributive laws in a concurrent refinement algebra

    Authors: Larissa A. Meinicke, Ian J. Hayes

    Abstract: Distributive laws are important for algebraic reasoning in arithmetic and logic. They are equally important for algebraic reasoning about concurrent programs. In existing theories such as Concurrent Kleene Algebra, only partial correctness is handled, and many of its distributive laws are weak, in the sense that they are only refinements in one direction, rather than equalities. The focus of this…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: 20 pages, 1 Figure

    ACM Class: F.3.1; D.1.3

  12. arXiv:2403.01218  [pdf, other]

    cs.LG cs.CR

    Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy

    Authors: Jamie Hayes, Ilia Shumailov, Eleni Triantafillou, Amr Khalifa, Nicolas Papernot

    Abstract: The high cost of model training makes it increasingly desirable to develop techniques for unlearning. These techniques seek to remove the influence of a training example without having to retrain the model from scratch. Intuitively, once a model has unlearned, an adversary that interacts with the model should no longer be able to tell whether the unlearned example was included in the model's train…

    Submitted 21 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

  13. arXiv:2402.05526  [pdf, other]

    cs.CR cs.LG

    Buffer Overflow in Mixture of Experts

    Authors: Jamie Hayes, Ilia Shumailov, Itay Yona

    Abstract: Mixture of Experts (MoE) has become a key ingredient for scaling large foundation models while keeping inference costs steady. We show that expert routing strategies that have cross-batch dependencies are vulnerable to attacks. Malicious queries can be sent to a model and can affect a model's output on other benign queries if they are grouped in the same batch. We demonstrate this via a proof-of-c…

    Submitted 8 February, 2024; originally announced February 2024.

  14. arXiv:2308.10888  [pdf, other]

    cs.LG cs.CV cs.CY

    Unlocking Accuracy and Fairness in Differentially Private Image Classification

    Authors: Leonard Berrada, Soham De, Judy Hanwen Shen, Jamie Hayes, Robert Stanforth, David Stutz, Pushmeet Kohli, Samuel L. Smith, Borja Balle

    Abstract: Privacy-preserving machine learning aims to train models on private data without leaking sensitive information. Differential privacy (DP) is considered the gold standard framework for privacy-preserving training, as it provides formal privacy guarantees. However, compared to their non-private counterparts, models trained with DP often have significantly reduced accuracy. Private classifiers are al…

    Submitted 21 August, 2023; originally announced August 2023.

  15. arXiv:2307.03928  [pdf, other]

    cs.CR cs.AI

    Bounding data reconstruction attacks with the hypothesis testing interpretation of differential privacy

    Authors: Georgios Kaissis, Jamie Hayes, Alexander Ziller, Daniel Rueckert

    Abstract: We explore Reconstruction Robustness (ReRo), which was recently proposed as an upper bound on the success of data reconstruction attacks against machine learning models. Previous research has demonstrated that differential privacy (DP) mechanisms also provide ReRo, but so far, only asymptotic Monte Carlo estimates of a tight ReRo bound have been shown. Directly computable ReRo bounds for general D…

    Submitted 8 July, 2023; originally announced July 2023.

  16. Trace models of concurrent valuation algebras

    Authors: Naso Evangelou-Oost, Larissa Meinicke, Callum Bannister, Ian J. Hayes

    Abstract: This paper introduces Concurrent Valuation Algebras (CVAs), a novel extension of ordered valuation algebras (OVAs). CVAs include two combine operators representing parallel and sequential products, adhering to a weak exchange law. This development offers theoretical and practical benefits for the specification and modelling of concurrent and distributed systems. As a presheaf on a space of domains…

    Submitted 21 August, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 26 pages

    Journal ref: Formal Methods and Software Engineering. ICFEM 2023. Lecture Notes in Computer Science, vol 14308. Springer, Singapore

  17. arXiv:2303.16867  [pdf, other]

    cs.CV

    A Video-based End-to-end Pipeline for Non-nutritive Sucking Action Recognition and Segmentation in Young Infants

    Authors: Shaotong Zhu, Michael Wan, Elaheh Hatamimajoumerd, Kashish Jain, Samuel Zlota, Cholpady Vikram Kamath, Cassandra B. Rowan, Emma C. Grace, Matthew S. Goodwin, Marie J. Hayes, Rebecca A. Schwartz-Mette, Emily Zimmerman, Sarah Ostadabbas

    Abstract: We present an end-to-end computer vision pipeline to detect non-nutritive sucking (NNS) -- an infant sucking pattern with no nutrition delivered -- as a potential biomarker for developmental delays, using off-the-shelf baby monitor video footage. One barrier to clinical (or algorithmic) assessment of NNS stems from its sparsity, requiring experts to wade through hours of footage to find minutes of…

    Submitted 29 March, 2023; originally announced March 2023.

  18. arXiv:2302.13861  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Differentially Private Diffusion Models Generate Useful Synthetic Images

    Authors: Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L. Smith, Olivia Wiles, Borja Balle

    Abstract: The ability to generate privacy-preserving synthetic versions of sensitive image datasets could unlock numerous ML applications currently constrained by data availability. Due to their astonishing image generation quality, diffusion models are a prime candidate for generating high-quality synthetic data. However, recent studies have found that, by default, the outputs of some diffusion models do n…

    Submitted 27 February, 2023; originally announced February 2023.

  19. arXiv:2302.09880  [pdf, other]

    cs.LG cs.CR

    Towards Unbounded Machine Unlearning

    Authors: Meghdad Kurmanji, Peter Triantafillou, Jamie Hayes, Eleni Triantafillou

    Abstract: Deep machine unlearning is the problem of `removing' from a trained neural network a subset of its training set. This problem is very timely and has many applications, including the key tasks of removing biases (RB), resolving confusion (RC) (caused by mislabelled data in trained models), as well as allowing users to exercise their `right to be forgotten' to protect User Privacy (UP). This paper i…

    Submitted 30 October, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

  20. arXiv:2302.07956  [pdf, other]

    cs.LG cs.CR

    Tight Auditing of Differentially Private Machine Learning

    Authors: Milad Nasr, Jamie Hayes, Thomas Steinke, Borja Balle, Florian Tramèr, Matthew Jagielski, Nicholas Carlini, Andreas Terzis

    Abstract: Auditing mechanisms for differential privacy use probabilistic means to empirically estimate the privacy level of an algorithm. For private machine learning, existing auditing mechanisms are tight: the empirical privacy estimate (nearly) matches the algorithm's provable privacy guarantee. But these auditing techniques suffer from two limitations. First, they only give tight estimates under implaus…

    Submitted 15 February, 2023; originally announced February 2023.

  21. arXiv:2302.07225  [pdf, other]

    cs.CR cs.LG

    Bounding Training Data Reconstruction in DP-SGD

    Authors: Jamie Hayes, Saeed Mahloujifar, Borja Balle

    Abstract: Differentially private training offers a protection which is usually interpreted as a guarantee against membership inference attacks. By proxy, this guarantee extends to other threats like reconstruction attacks attempting to extract complete training examples. Recent works provide evidence that if one does not need to protect against membership attacks but instead only wants to protect against tr…

    Submitted 30 October, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: New experiments and comparison with related work

  22. arXiv:2301.13188  [pdf, other]

    cs.CR cs.CV cs.LG

    Extracting Training Data from Diffusion Models

    Authors: Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace

    Abstract: Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the…

    Submitted 30 January, 2023; originally announced January 2023.

  23. Verifying term graph optimizations using Isabelle/HOL

    Authors: Brae J. Webb, Ian J. Hayes, Mark Utting

    Abstract: Our objective is to formally verify the correctness of the hundreds of expression optimization rules used within the GraalVM compiler. When defining the semantics of a programming language, expressions naturally form abstract syntax trees, or, terms. However, in order to facilitate sharing of common subexpressions, modern compilers represent expressions as term graphs. Defining the semantics of te…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: 14 pages, 7 figures, to be published in CPP2023

  24. arXiv:2212.04001  [pdf, other]

    cs.CL cs.LG

    TweetDrought: A Deep-Learning Drought Impacts Recognizer based on Twitter Data

    Authors: Beichen Zhang, Frank Schilder, Kelly Helm Smith, Michael J. Hayes, Sherri Harms, Tsegaye Tadesse

    Abstract: Acquiring a better understanding of drought impacts becomes increasingly vital under a warming climate. Traditional drought indices describe mainly biophysical variables and not impacts on social, economic, and environmental systems. We utilized natural language processing and bidirectional encoder representation from Transformers (BERT) based transfer learning to fine-tune the model on the data f…

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: 5 pages (+3 in appendix), 5 figures in appendix, 2 tables (+1 in appendix), ICML Workshop on Tackling Climate Change with Machine Learning, 2021

  25. arXiv:2212.01748  [pdf, other]

    cs.LO cs.PL cs.SE

    Differential Testing of a Verification Framework for Compiler Optimizations (Experience Paper)

    Authors: Mark Utting, Brae J. Webb, Ian J. Hayes

    Abstract: We want to verify the correctness of optimization phases in the GraalVM compiler, which consist of many thousands of lines of complex Java code performing sophisticated graph transformations. We have built high-level models of the data structures and operations of the code using the Isabelle/HOL theorem prover, and can formally verify the correctness of those high-level operations. But the remaini…

    Submitted 4 December, 2022; originally announced December 2022.

    Comments: 8 pages, 6 figures

  26. arXiv:2211.02768  [pdf, other]

    cs.LG stat.AP

    Quantitative Assessment of Drought Impacts Using XGBoost based on the Drought Impact Reporter

    Authors: Beichen Zhang, Fatima K. Abu Salem, Michael J. Hayes, Tsegaye Tadesse

    Abstract: Under climate change, the increasing frequency, intensity, and spatial extent of drought events lead to higher socio-economic costs. However, the relationships between the hydro-meteorological indicators and drought impacts are not identified well yet because of the complexity and data scarcity. In this paper, we proposed a framework based on the extreme gradient model (XGBoost) for Texas to predi…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: 4 pages with 2 figures and 1 table. NeurIPS workshop on Tackling Climate Change with Machine Learning, 2020

  27. Contextuality in distributed systems

    Authors: Nasos Evangelou-Oost, Callum Bannister, Ian J. Hayes

    Abstract: We present a lattice of distributed program specifications, whose ordering represents implementability/refinement. Specifications are modelled by families of subsets of relative execution traces, which encode the local orderings of state transitions, rather than their absolute timing according to a global clock. This is to overcome fundamental physical difficulties with synchronisation. The lattic…

    Submitted 23 April, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: 22 pages

    Journal ref: In: Relational and Algebraic Methods in Computer Science. RAMiCS 2023. Lecture Notes in Computer Science, vol 13896. Springer, Cham (2023)

  28. arXiv:2210.08655  [pdf, other]

    cs.LG cs.AI

    Evaluation of the Synthetic Electronic Health Records

    Authors: Emily Muller, Xu Zheng, Jer Hayes

    Abstract: Generative models have been found effective for data synthesis due to their ability to capture complex underlying data distributions. The quality of generated data from these models is commonly evaluated by visual inspection for image datasets or downstream analytical tasks for tabular datasets. These evaluation methods neither measure the implicit data distribution nor consider the data privacy i…

    Submitted 16 October, 2022; originally announced October 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2201.05400

  29. arXiv:2204.13650  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Unlocking High-Accuracy Differentially Private Image Classification through Scale

    Authors: Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle

    Abstract: Differential Privacy (DP) provides a formal privacy guarantee preventing adversaries with access to a machine learning model from extracting information about individual training points. Differentially Private Stochastic Gradient Descent (DP-SGD), the most popular DP training method for deep learning, realizes this protection by injecting noise during training. However previous works have found th…

    Submitted 16 June, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

  30. arXiv:2201.05400  [pdf, other]

    cs.LG cs.AI

    Synthesising Electronic Health Records: Cystic Fibrosis Patient Group

    Authors: Emily Muller, Xu Zheng, Jer Hayes

    Abstract: Class imbalance can often degrade predictive performance of supervised learning algorithms. Balanced classes can be obtained by oversampling exact copies, with noise, or interpolation between nearest neighbours (as in traditional SMOTE methods). Oversampling tabular data using augmentation, as is typical in computer vision tasks, can be achieved with deep generative models. Deep generative models…

    Submitted 14 January, 2022; originally announced January 2022.

  31. arXiv:2201.04845  [pdf, other]

    cs.CR cs.LG

    Reconstructing Training Data with Informed Adversaries

    Authors: Borja Balle, Giovanni Cherubin, Jamie Hayes

    Abstract: Given access to a machine learning model, can an adversary reconstruct the model's training data? This work studies this question from the lens of a powerful informed adversary who knows all the training data points except one. By instantiating concrete attacks, we show it is feasible to reconstruct the remaining data point in this stringent threat model. For convex models (e.g. logistic regressio…

    Submitted 25 April, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

    Comments: Published at "2022 IEEE Symposium on Security and Privacy (SP)"

  32. arXiv:2201.02265  [pdf, other]

    cs.LG

    Learning to be adversarially robust and differentially private

    Authors: Jamie Hayes, Borja Balle, M. Pawan Kumar

    Abstract: We study the difficulties in learning that arise from robust and differentially private optimization. We first study convergence of gradient descent based adversarial training with differential privacy, taking a simple binary classification task on linearly separable data as an illustrative example. We compare the gap between adversarial and nominal risk in both private and non-private settings, s…

    Submitted 6 January, 2022; originally announced January 2022.

    Comments: Preliminary work appeared at PPML 2021

  33. arXiv:2111.09085  [pdf, other]

    cs.LG cs.AI cs.CR cs.SI

    Network Generation with Differential Privacy

    Authors: Xu Zheng, Nicholas McCarthy, Jer Hayes

    Abstract: We consider the problem of generating private synthetic versions of real-world graphs containing private information while maintaining the utility of generated graphs. Differential privacy is a gold standard for data privacy, and the introduction of the differentially private stochastic gradient descent (DP-SGD) algorithm has facilitated the training of private neural models in a number of domains…

    Submitted 17 November, 2021; originally announced November 2021.

  34. arXiv:2111.09084  [pdf, other]

    cs.AI

    A Graph-based Imputation Method for Sparse Medical Records

    Authors: Ramon Vinas, Xu Zheng, Jer Hayes

    Abstract: Electronic Medical Records (EHR) are extremely sparse. Only a small proportion of events (symptoms, diagnoses, and treatments) are observed in the lifetime of an individual. The high degree of missingness of EHR can be attributed to a large number of factors, including device failure, privacy concerns, or other unexpected reasons. Unfortunately, many traditional imputation methods are not well sui…

    Submitted 17 November, 2021; originally announced November 2021.

  35. arXiv:2108.12326  [pdf]

    cs.ET

    CeMux: Maximizing the Accuracy of Stochastic Mux Adders and an Application to Filter Design

    Authors: Timothy J. Baker, John P. Hayes

    Abstract: Stochastic computing (SC) is a low-cost computational paradigm that has promising applications in digital filter design, image processing and neural networks. Fundamental to these applications is the weighted addition operation which is most often implemented by a multiplexer (mux) tree. Mux-based adders have very low area but typically require long bit-streams to reach practical accuracy threshol…

    Submitted 30 August, 2021; v1 submitted 27 August, 2021; originally announced August 2021.

    ACM Class: B.2

  36. A Formal Semantics of the GraalVM Intermediate Representation

    Authors: Brae J. Webb, Mark Utting, Ian J. Hayes

    Abstract: The optimization phase of a compiler is responsible for transforming an intermediate representation (IR) of a program into a more efficient form. Modern optimizers, such as that used in the GraalVM compiler, use an IR consisting of a sophisticated graph data structure that combines data flow and control flow into the one structure. As part of a wider project on the verification of optimization pas…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: 16 pages, 8 figures, to be published to ATVA 2021

  37. arXiv:2103.15292  [pdf, other]

    cs.LO

    Deriving Laws for Developing Concurrent Programs in a Rely-Guarantee Style

    Authors: Ian J. Hayes, Larissa A. Meinicke, Patrick A. Meiring

    Abstract: This paper presents a theory for the refinement of shared-memory concurrent algorithms from specifications. We augment pre and post condition specifications with Jones' rely and guarantee conditions, all of which are encoded as commands within a wide-spectrum language. Program components are specified using either partial or total correctness versions of postcondition specifications. Operations on…

    Submitted 8 September, 2023; v1 submitted 28 March, 2021; originally announced March 2021.

    ACM Class: D.2.4; F.3.1; D.1.3

  38. arXiv:2011.07355  [pdf, other]

    cs.LG cs.CR

    Towards transformation-resilient provenance detection of digital media

    Authors: Jamie Hayes, Krishnamurthy Dvijotham, Yutian Chen, Sander Dieleman, Pushmeet Kohli, Norman Casagrande

    Abstract: Advancements in deep generative models have made it possible to synthesize images, videos and audio signals that are difficult to distinguish from natural signals, creating opportunities for potential abuse of these capabilities. This motivates the problem of tracking the provenance of signals, i.e., being able to determine the original source of a signal. Watermarking the signal at the time of si…

    Submitted 14 November, 2020; originally announced November 2020.

  39. arXiv:2010.10294  [pdf, other]

    cs.CR cs.LG

    Adaptive Webpage Fingerprinting from TLS Traces

    Authors: Vasilios Mavroudis, Jamie Hayes

    Abstract: In webpage fingerprinting, an on-path adversary infers the specific webpage loaded by a victim user by analysing the patterns in the encrypted TLS traffic exchanged between the user's browser and the website's servers. This work studies modern webpage fingerprinting adversaries against the TLS protocol; aiming to shed light on their capabilities and inform potential defences. Despite the importanc…

    Submitted 27 October, 2023; v1 submitted 19 October, 2020; originally announced October 2020.

  40. arXiv:2009.13946  [pdf, other]

    cs.LG q-bio.BM

    ChemoVerse: Manifold traversal of latent spaces for novel molecule discovery

    Authors: Harshdeep Singh, Nicholas McCarthy, Qurrat Ul Ain, Jeremiah Hayes

    Abstract: In order to design a more potent and effective chemical entity, it is essential to identify molecular structures with the desired chemical properties. Recent advances in generative models using neural networks and machine learning are being widely used by many emerging startups and researchers in this domain to design virtual libraries of drug-like compounds. Although these models can help a scien…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: 5 pages, 2 figures, presented at the First Workshop on Applied Deep Generative Networks, ECAI 2020 (workshop link: https://sites.google.com/view/adgn-20/home)

  41. arXiv:2009.03561  [pdf, other]

    cs.CR cs.AI

    Local and Central Differential Privacy for Robustness and Privacy in Federated Learning

    Authors: Mohammad Naseri, Jamie Hayes, Emiliano De Cristofaro

    Abstract: Federated Learning (FL) allows multiple participants to train machine learning models collaboratively by keeping their datasets local while only exchanging model updates. Alas, this is not necessarily free from privacy and robustness vulnerabilities, e.g., via membership, property, and backdoor attacks. This paper investigates whether and to what extent one can use differential privacy (DP) to pro…

    Submitted 27 May, 2022; v1 submitted 8 September, 2020; originally announced September 2020.

    Journal ref: Published in the Proceedings of the 29th Network and Distributed System Security Symposium (NDSS 2022)

  42. arXiv:2006.04622  [pdf, other]

    cs.LG cs.CR stat.ML

    Trade-offs between membership privacy & adversarially robust learning

    Authors: Jamie Hayes

    Abstract: Historically, machine learning methods have not been designed with security in mind. In turn, this has given rise to adversarial examples, carefully perturbed input samples aimed to mislead detection at test time, which have been applied to attack spam and malware classification, and more recently to attack image classification. Consequently, an abundance of research has been devoted to designing…

    Submitted 7 January, 2022; v1 submitted 8 June, 2020; originally announced June 2020.

  43. arXiv:2006.04208  [pdf, other]

    cs.LG stat.ML

    Extensions and limitations of randomized smoothing for robustness guarantees

    Authors: Jamie Hayes

    Abstract: Randomized smoothing, a method to certify that a classifier's decision on an input is invariant under adversarial noise, offers attractive advantages over other certification methods. It operates in a black-box manner, so certification is not constrained by the size of the classifier's architecture. Here, we extend the work of Li et al. (2018), studying how the choice of divergence between smo…

    Submitted 7 June, 2020; originally announced June 2020.

    Comments: CVPR 2020 Workshop on Adversarial Machine Learning in Computer Vision

  44. arXiv:2006.03873  [pdf, other]

    cs.LG cs.CR stat.ML

    Unique properties of adversarially trained linear classifiers on Gaussian data

    Authors: Jamie Hayes

    Abstract: Machine learning models are vulnerable to adversarial perturbations that, when added to an input, can cause high-confidence misclassifications. The adversarial learning research community has made remarkable progress in understanding the root causes of adversarial perturbations. However, most problems that one may consider important to solve for the deployment of machine learning in safety…

    Submitted 6 June, 2020; originally announced June 2020.

  45. arXiv:1910.05624  [pdf, other]

    cs.RO cs.CL cs.HC

    A Research Platform for Multi-Robot Dialogue with Humans

    Authors: Matthew Marge, Stephen Nogar, Cory J. Hayes, Stephanie M. Lukin, Jesse Bloecker, Eric Holder, Clare Voss

    Abstract: This paper presents a research platform that supports spoken dialogue interaction with multiple robots. The demonstration showcases our crafted MultiBot testing scenario in which users can verbally issue search, navigate, and follow instructions to two robotic teammates: a simulated ground robot and an aerial robot. This flexible language and robotic platform takes advantage of existing tools for…

    Submitted 12 October, 2019; originally announced October 2019.

    Comments: Accepted for publication at NAACL 2019; also presented at AI-HRI 2019 (arXiv:1909.04812)

    Report number: AI-HRI/2019/05

  46. arXiv:1907.04005  [pdf, ps, other]

    cs.LO

    Handling localisation in rely/guarantee concurrency: An algebraic approach

    Authors: Larissa A. Meinicke, Ian J. Hayes

    Abstract: The rely/guarantee approach of Jones extends Hoare logic with rely and guarantee conditions in order to allow compositional reasoning about shared-variable concurrent programs. This paper focuses on localisation in the context of rely/guarantee concurrency in order to support local variables. Because we allow the body of a local variable block to contain component processes that run in parallel, t…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: 16 pages

    MSC Class: 68Q85; ACM Class: F.3.1

  47. arXiv:1901.02402  [pdf, other]

    cs.CR cs.LG

    Contamination Attacks and Mitigation in Multi-Party Machine Learning

    Authors: Jamie Hayes, Olga Ohrimenko

    Abstract: Machine learning is data hungry; the more data a model has access to in training, the more likely it is to perform well at inference time. Distinct parties may want to combine their local data to gain the benefits of a model trained on a large corpus of data. We consider such a case: parties get access to the model trained on their joint data but do not see each other's individual datasets. We show…

    Submitted 8 January, 2019; originally announced January 2019.

  48. arXiv:1811.06539  [pdf, ps, other]

    cs.CR cs.LG

    A note on hyperparameters in black-box adversarial examples

    Authors: Jamie Hayes

    Abstract: Since Biggio et al. (2013) and Szegedy et al. (2013) first drew attention to adversarial examples, there has been a flood of research into defending and attacking machine learning models. However, almost all proposed attacks assume white-box access to a model. In other words, the attacker is assumed to have perfect knowledge of the model's weights and architecture. With this insider knowledge, a wh…

    Submitted 15 November, 2018; originally announced November 2018.

  49. arXiv:1810.10939  [pdf, other]

    cs.LG cs.CR stat.ML

    Evading classifiers in discrete domains with provable optimality guarantees

    Authors: Bogdan Kulynych, Jamie Hayes, Nikita Samarin, Carmela Troncoso

    Abstract: Machine-learning models for security-critical applications, such as bot, malware, or spam detection, operate in constrained discrete domains. These applications would benefit from having provable guarantees against adversarial examples. The existing literature on provable adversarial robustness of models, however, exclusively focuses on robustness to gradient-based attacks in domains such as images…

    Submitted 1 July, 2019; v1 submitted 25 October, 2018; originally announced October 2018.

    Comments: NeurIPS 2018 Workshop on Security in Machine Learning

  50. Some Challenges of Specifying Concurrent Program Components

    Authors: Ian J. Hayes

    Abstract: The purpose of this paper is to address some of the challenges of formally specifying components of shared-memory concurrent programs. The focus is to provide an abstract specification of a component that is suitable for use both by clients of the component and as a starting point for refinement to an implementation of the component. We present some approaches to devising specifications, investiga…

    Submitted 22 October, 2018; originally announced October 2018.

    Comments: In Proceedings Refine 2018, arXiv:1810.08739

    Journal ref: EPTCS 282, 2018, pp. 10-22