Showing 1–12 of 12 results for author: Marchal, N

  1. arXiv:2407.16895  [pdf, other]

    cs.CY cs.AI

    (Unfair) Norms in Fairness Research: A Meta-Analysis

    Authors: Jennifer Chien, A. Stevie Bergman, Kevin R. McKee, Nenad Tomasev, Vinodkumar Prabhakaran, Rida Qadri, Nahema Marchal, William Isaac

    Abstract: Algorithmic fairness has emerged as a critical concern in artificial intelligence (AI) research. However, the development of fair AI systems is not an objective process. Fairness is an inherently subjective concept, shaped by the values, experiences, and identities of those involved in research and development. To better understand the norms and values embedded in current fairness research, we con…

    Submitted 17 June, 2024; originally announced July 2024.

  2. arXiv:2407.12687  [pdf, other]

    cs.CY cs.AI cs.LG

    Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach

    Authors: Irina Jurenka, Markus Kunesch, Kevin R. McKee, Daniel Gillick, Shaojian Zhu, Sara Wiltberger, Shubham Milind Phal, Katherine Hermann, Daniel Kasenberg, Avishkar Bhoopchand, Ankit Anand, Miruna Pîslar, Stephanie Chan, Lisa Wang, Jennifer She, Parsa Mahmoudieh, Aliya Rysbek, Wei-Jen Ko, Andrea Huber, Brett Wiltshire, Gal Elidan, Roni Rabin, Jasmin Rubinovitz, Amit Pitaru, Mac McAllister, et al. (49 additional authors not shown)

    Abstract: A major challenge facing the world is the provision of equitable and universal access to quality education. Recent advances in generative AI (gen AI) have created excitement about the potential of new technologies to offer a personal tutor for every learner and a teaching assistant for every teacher. The full extent of this dream, however, has not yet materialised. We argue that this is primarily…

    Submitted 19 July, 2024; v1 submitted 21 May, 2024; originally announced July 2024.

  3. arXiv:2406.13843  [pdf, other]

    cs.AI

    Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data

    Authors: Nahema Marchal, Rachel Xu, Rasmi Elasmar, Iason Gabriel, Beth Goldberg, William Isaac

    Abstract: Generative, multimodal artificial intelligence (GenAI) offers transformative potential across industries, but its misuse poses significant risks. Prior research has shed light on the potential of advanced AI systems to be exploited for malicious purposes. However, we still lack a concrete understanding of how GenAI models are specifically exploited or abused in practice, including the tactics empl…

    Submitted 21 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  4. arXiv:2406.11757  [pdf, other]

    cs.AI cs.CL cs.CY cs.HC

    STAR: SocioTechnical Approach to Red Teaming Language Models

    Authors: Laura Weidinger, John Mellor, Bernat Guillen Pegueroles, Nahema Marchal, Ravin Kumar, Kristian Lum, Canfer Akbulut, Mark Diaz, Stevie Bergman, Mikel Rodriguez, Verena Rieser, William Isaac

    Abstract: This research introduces STAR, a sociotechnical framework that improves on current best practices for red teaming safety of large language models. STAR makes two key contributions: it enhances steerability by generating parameterised instructions for human red teamers, leading to improved coverage of the risk surface. Parameterised instructions also provide more detailed insights into model failur…

    Submitted 6 August, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures, 5 pages appendix. * denotes equal contribution

  5. arXiv:2404.16244  [pdf, other]

    cs.CY

    The Ethics of Advanced AI Assistants

    Authors: Iason Gabriel, Arianna Manzini, Geoff Keeling, Lisa Anne Hendricks, Verena Rieser, Hasan Iqbal, Nenad Tomašev, Ira Ktena, Zachary Kenton, Mikel Rodriguez, Seliem El-Sayed, Sasha Brown, Canfer Akbulut, Andrew Trask, Edward Hughes, A. Stevie Bergman, Renee Shelby, Nahema Marchal, Conor Griffin, Juan Mateos-Garcia, Laura Weidinger, Winnie Street, Benjamin Lange, Alex Ingerman, Alison Lentz, et al. (32 additional authors not shown)

    Abstract: This paper focuses on the opportunities and the ethical and societal risks posed by advanced AI assistants. We define advanced AI assistants as artificial agents with natural language interfaces, whose function is to plan and execute sequences of actions on behalf of a user, across one or more domains, in line with the user's expectations. The paper starts by considering the technology itself, pro…

    Submitted 28 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  6. arXiv:2404.15058  [pdf, other]

    cs.CY cs.AI

    A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI

    Authors: Seliem El-Sayed, Canfer Akbulut, Amanda McCroskery, Geoff Keeling, Zachary Kenton, Zaria Jalan, Nahema Marchal, Arianna Manzini, Toby Shevlane, Shannon Vallor, Daniel Susser, Matija Franklin, Sophie Bridgers, Harry Law, Matthew Rahtz, Murray Shanahan, Michael Henry Tessler, Arthur Douillard, Tom Everitt, Sasha Brown

    Abstract: Recent generative AI systems have demonstrated more advanced persuasive capabilities and are increasingly permeating areas of life where they can influence decision-making. Generative AI presents a new risk profile of persuasion due to the opportunity for reciprocal exchange and prolonged interactions. This has led to growing concerns about harms from AI persuasion and how they can be mitigated, high…

    Submitted 23 April, 2024; originally announced April 2024.

  7. arXiv:2310.11986  [pdf, other]

    cs.AI cs.CL cs.CY

    Sociotechnical Safety Evaluation of Generative AI Systems

    Authors: Laura Weidinger, Maribeth Rauh, Nahema Marchal, Arianna Manzini, Lisa Anne Hendricks, Juan Mateos-Garcia, Stevie Bergman, Jackie Kay, Conor Griffin, Ben Bariach, Iason Gabriel, Verena Rieser, William Isaac

    Abstract: Generative AI systems produce a range of risks. To ensure the safety of generative AI systems, these risks must be evaluated. In this paper, we make two main contributions toward establishing such evaluations. First, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks. This framework encompasses capability evaluations, which are the main…

    Submitted 31 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: main paper p.1-29, 5 figures, 2 tables

  8. arXiv:2305.15324  [pdf, other]

    cs.AI

    Model evaluation for extreme risks

    Authors: Toby Shevlane, Sebastian Farquhar, Ben Garfinkel, Mary Phuong, Jess Whittlestone, Jade Leung, Daniel Kokotajlo, Nahema Marchal, Markus Anderljung, Noam Kolt, Lewis Ho, Divya Siddarth, Shahar Avin, Will Hawkins, Been Kim, Iason Gabriel, Vijay Bolina, Jack Clark, Yoshua Bengio, Paul Christiano, Allan Dafoe

    Abstract: Current approaches to building general-purpose AI systems tend to produce systems with both beneficial and harmful capabilities. Further progress in AI development could lead to capabilities that pose extreme risks, such as offensive cyber capabilities or strong manipulation skills. We explain why model evaluation is critical for addressing extreme risks. Developers must be able to identify danger…

    Submitted 22 September, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Fixed typos; added citation

    ACM Class: K.4.1

  9. arXiv:2206.10062  [pdf, other]

    cs.RO cs.AI

    Early Recall, Late Precision: Multi-Robot Semantic Object Mapping under Operational Constraints in Perceptually-Degraded Environments

    Authors: Xianmei Lei, Taeyeon Kim, Nicolas Marchal, Daniel Pastor, Barry Ridge, Frederik Schöller, Edward Terry, Fernando Chavez, Thomas Touma, Kyohei Otsu, Ali Agha

    Abstract: Semantic object mapping in uncertain, perceptually degraded environments during long-range multi-robot autonomous exploration tasks such as search-and-rescue is important and challenging. During such missions, high recall is desirable to avoid missing true target objects and high precision is also critical to avoid wasting valuable operational time on false positives. Given recent advancements in…

    Submitted 20 June, 2022; originally announced June 2022.

  10. arXiv:2002.12069  [pdf]

    cs.SI

    Junk News & Information Sharing During the 2019 UK General Election

    Authors: Nahema Marchal, Bence Kollanyi, Lisa-Maria Neudert, Hubert Au, Philip N. Howard

    Abstract: Today, an estimated 75% of the British public access information about politics and public life online, and 40% do so via social media. With this context in mind, we investigate information sharing patterns over social media in the lead-up to the 2019 UK General Elections, and ask: (1) What type of political news and information were social media users sharing on Twitter ahead of the vote? (2) How…

    Submitted 27 February, 2020; originally announced February 2020.

  11. arXiv:2001.11461  [pdf]

    cs.SI stat.AP

    Echo Chambers Exist! (But They're Full of Opposing Views)

    Authors: Jonathan Bright, Nahema Marchal, Bharath Ganesh, Stevan Rudinac

    Abstract: The theory of echo chambers, which suggests that online political discussions take place in conditions of ideological homogeneity, has recently gained popularity as an explanation for patterns of political polarization and radicalization observed in many democratic countries. However, while micro-level experimental work has shown evidence that individuals may gravitate towards information that sup…

    Submitted 30 January, 2020; originally announced January 2020.

  12. Learning Densities in Feature Space for Reliable Segmentation of Indoor Scenes

    Authors: Nicolas Marchal, Charlotte Moraldo, Roland Siegwart, Hermann Blum, Cesar Cadena, Abel Gawel

    Abstract: Deep learning has enabled remarkable advances in scene understanding, particularly in semantic segmentation tasks. Yet, current state of the art approaches are limited to a closed set of classes, and fail when facing novel elements, also known as out of distribution (OoD) data. This is a problem as autonomous agents will inevitably come across a wide range of objects, all of which cannot be includ…

    Submitted 13 January, 2020; v1 submitted 1 August, 2019; originally announced August 2019.

    Comments: Preprint version after acceptance for publication in IEEE Robotics and Automation Letters

    Journal ref: IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 1032-1038, April 2020