Showing 1–24 of 24 results for author: Birhane, A

  1. arXiv:2407.08790

    cs.CL cs.CY

    Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency

    Authors: Abeba Birhane, Marek McGann

    Abstract: In this paper we argue that key, often sensational and misleading, claims regarding linguistic capabilities of Large Language Models (LLMs) are based on at least two unfounded assumptions; the assumption of language completeness and the assumption of data completeness. Language completeness assumes that a distinct and complete thing such as 'a natural language' exists, the essential characteristic…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: To appear in the Journal of Language Sciences

  2. The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models

    Authors: Abeba Birhane, Sepehr Dehdashtian, Vinay Uday Prabhu, Vishnu Boddeti

    Abstract: Scale the model, scale the data, scale the GPU farms is the reigning sentiment in the world of generative AI today. While model scaling has been extensively studied, data scaling and its downstream impacts on model performance remain under-explored. This is particularly important in the context of multimodal datasets whose main source is the World Wide Web, condensed and packaged as the Common Cra…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: To appear in the proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT 24), June 3 to 6, 2024, Rio de Janeiro, Brazil. arXiv admin note: text overlap with arXiv:2306.13141

  3. arXiv:2404.10072

    cs.CY

    Debunking Robot Rights Metaphysically, Ethically, and Legally

    Authors: Abeba Birhane, Jelle van Dijk, Frank Pasquale

    Abstract: In this work we challenge arguments for robot rights on metaphysical, ethical and legal grounds. Metaphysically, we argue that machines are not the kinds of things that may be denied or granted rights. Building on theories of phenomenology and post-Cartesian approaches to cognitive science, we ground our position in the lived reality of actual humans in an increasingly ubiquitously connected, cont…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Published in First Monday special issue entitled "Ideologies of AI and the consolidation of power"

  4. arXiv:2402.17861

    cs.CY

    Towards AI Accountability Infrastructure: Gaps and Opportunities in AI Audit Tooling

    Authors: Victor Ojewale, Ryan Steed, Briana Vecchione, Abeba Birhane, Inioluwa Deborah Raji

    Abstract: Audits are critical mechanisms for identifying the risks and limitations of deployed artificial intelligence (AI) systems. However, the effective execution of AI audits remains incredibly difficult. As a result, practitioners make use of various tools to support their efforts. Drawing on interviews with 35 AI audit practitioners and a landscape analysis of 390 tools, we map the current ecosystem o…

    Submitted 14 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  5. arXiv:2401.14462

    cs.CY

    AI auditing: The Broken Bus on the Road to AI Accountability

    Authors: Abeba Birhane, Ryan Steed, Victor Ojewale, Briana Vecchione, Inioluwa Deborah Raji

    Abstract: One of the most concrete measures to take towards meaningful AI accountability is to consequentially assess and report the systems' performance and impact. However, the practical nature of the "AI audit" ecosystem is muddled and imprecise, making it difficult to work through various concepts and map out the stakeholders involved in the practice. First, we taxonomize current AI audit practices as c…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: To appear in the proceedings of the 2nd IEEE Conference on Secure and Trustworthy Machine Learning (SaTML) 2024

  6. arXiv:2311.03449

    cs.CY

    Into the LAIONs Den: Investigating Hate in Multimodal Datasets

    Authors: Abeba Birhane, Vinay Prabhu, Sang Han, Vishnu Naresh Boddeti, Alexandra Sasha Luccioni

    Abstract: 'Scale the model, scale the data, scale the compute' is the reigning sentiment in the world of generative AI today. While the impact of model scaling has been extensively studied, we are only beginning to scratch the surface of data scaling and its consequences. This is especially of critical importance in the context of vision-language datasets such as LAION. These datasets are continually growin…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: To appear at 37th Conference on Neural Information Processing Systems (NeurIPS 2023) Datasets and Benchmarks Track. arXiv admin note: substantial text overlap with arXiv:2306.13141

  7. arXiv:2309.15084

    cs.CV cs.CY

    The Surveillance AI Pipeline

    Authors: Pratyusha Ria Kalluri, William Agnew, Myra Cheng, Kentrell Owens, Luca Soldaini, Abeba Birhane

    Abstract: A rapidly growing number of voices argue that AI research, and computer vision in particular, is powering mass surveillance. Yet the direct path from computer vision research to surveillance has remained obscured and difficult to assess. Here, we reveal the Surveillance AI pipeline by analyzing three decades of computer vision research papers and downstream patents, more than 40,000 documents. We…

    Submitted 17 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

  8. arXiv:2306.13141

    cs.CY

    On Hate Scaling Laws For Data-Swamps

    Authors: Abeba Birhane, Vinay Prabhu, Sang Han, Vishnu Naresh Boddeti

    Abstract: 'Scale the model, scale the data, scale the GPU-farms' is the reigning sentiment in the world of generative AI today. While model scaling has been extensively studied, data scaling and its downstream impacts remain under explored. This is especially of critical importance in the context of visio-linguistic datasets whose main source is the World Wide Web, condensed and packaged as the CommonCrawl…

    Submitted 28 June, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

  9. arXiv:2301.08559

    q-bio.OT cs.LG nlin.AO physics.bio-ph

    The Lost Art of Mathematical Modelling

    Authors: Linnéa Gyllingberg, Abeba Birhane, David J. T. Sumpter

    Abstract: We provide a critique of mathematical biology in light of rapid developments in modern machine learning. We argue that out of the three modelling activities -- (1) formulating models; (2) analysing models; and (3) fitting or comparing models to data -- inherent to mathematical biology, researchers currently focus too much on activity (2) at the cost of (1). This trend, we propose, can be reversed…

    Submitted 2 June, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

    MSC Class: 92B05

  10. Power to the People? Opportunities and Challenges for Participatory AI

    Authors: Abeba Birhane, William Isaac, Vinodkumar Prabhakaran, Mark Díaz, Madeleine Clare Elish, Iason Gabriel, Shakir Mohamed

    Abstract: Participatory approaches to artificial intelligence (AI) and machine learning (ML) are gaining momentum: the increased attention comes partly with the view that participation opens the gateway to an inclusive, equitable, robust, responsible and trustworthy AI. Among other benefits, participatory approaches are essential to understanding and adequately representing the needs, desires and perspective…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: To appear in the proceedings of EAAMO 2022

  11. arXiv:2206.04179

    cs.CY

    Automating Ambiguity: Challenges and Pitfalls of Artificial Intelligence

    Authors: Abeba Birhane

    Abstract: Machine learning (ML) and artificial intelligence (AI) tools increasingly permeate every possible social, political, and economic sphere; sorting, taxonomizing and predicting complex human behaviour and social phenomena. However, from fallacious and naive groundings regarding complex adaptive systems to datasets underlying models, these systems are beset by problems, challenges, and limitations. T…

    Submitted 8 June, 2022; originally announced June 2022.

    Comments: PhD thesis

  12. arXiv:2205.08922

    cs.CY

    The games we play: critical complexity improves machine learning

    Authors: Abeba Birhane, David J. T. Sumpter

    Abstract: When mathematical modelling is applied to capture a complex system, multiple models are often created that characterize different aspects of that system. Often, a model at one level will produce a prediction which is contradictory at another level but both models are accepted because they are both useful. Rather than aiming to build a single unified model of a complex system, the modeller acknowle…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: To appear in the HHAI 2022 conference proceedings

  13. The Forgotten Margins of AI Ethics

    Authors: Abeba Birhane, Elayne Ruane, Thomas Laurent, Matthew S. Brown, Johnathan Flowers, Anthony Ventresque, Christopher L. Dancy

    Abstract: How has recent AI Ethics literature addressed topics such as fairness and justice in the context of continued social and structural power asymmetries? We trace both the historical roots and current landmark work that have been shaping the field and categorize these works under three broad umbrellas: (i) those grounded in Western canonical philosophy, (ii) mathematical and statistical methods, and…

    Submitted 9 May, 2022; originally announced May 2022.

    Comments: To appear in the FAccT 2022 proceedings

  14. arXiv:2204.14256

    cs.CL

    Handling and Presenting Harmful Text in NLP Research

    Authors: Hannah Rose Kirk, Abeba Birhane, Bertie Vidgen, Leon Derczynski

    Abstract: Text data can pose a risk of harm. However, the risks are not fully understood, and how to handle, present, and discuss harmful text in a safe way remains an unresolved issue in the NLP community. We provide an analytical framework categorising harms on three axes: (1) the harm type (e.g., misinformation, hate speech or racial stereotypes); (2) whether a harm is sought as a feature of the…

    Submitted 24 February, 2023; v1 submitted 29 April, 2022; originally announced April 2022.

    Comments: In Findings of EMNLP 2022

  15. arXiv:2204.03100

    cs.CY cs.HC cs.LG

    Data Justice Stories: A Repository of Case Studies

    Authors: David Leslie, Morgan Briggs, Antonella Perini, Smera Jayadeva, Cami Rincón, Noopur Raval, Abeba Birhane, Rosamund Powell, Michael Katell, Mhairi Aitken

    Abstract: The idea of "data justice" is of recent academic vintage. It has arisen over the past decade in Anglo-European research institutions as an attempt to bring together a critique of the power dynamics that underlie accelerating trends of datafication with a normative commitment to the principles of social justice, a commitment to the achievement of a society that is equitable, fair, and capable of con…

    Submitted 6 April, 2022; originally announced April 2022.

  16. arXiv:2204.03090

    cs.CY cs.AI cs.GL cs.HC

    Advancing Data Justice Research and Practice: An Integrated Literature Review

    Authors: David Leslie, Michael Katell, Mhairi Aitken, Jatinder Singh, Morgan Briggs, Rosamund Powell, Cami Rincón, Thompson Chengeta, Abeba Birhane, Antonella Perini, Smera Jayadeva, Anjali Mazumder

    Abstract: The Advancing Data Justice Research and Practice (ADJRP) project aims to widen the lens of current thinking around data justice and to provide actionable resources that will help policymakers, practitioners, and impacted communities gain a broader understanding of what equitable, freedom-promoting, and rights-sustaining data collection, governance, and use should look like in increasingly dynamic…

    Submitted 6 April, 2022; originally announced April 2022.

  17. arXiv:2112.04359

    cs.CL cs.AI cs.CY

    Ethical and social risks of harm from Language Models

    Authors: Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

    Abstract: This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguist…

    Submitted 8 December, 2021; originally announced December 2021.

  18. arXiv:2110.01963

    cs.CY

    Multimodal datasets: misogyny, pornography, and malignant stereotypes

    Authors: Abeba Birhane, Vinay Uday Prabhu, Emmanuel Kahembwe

    Abstract: We have now entered the era of trillion parameter machine learning models trained on billion-sized datasets scraped from the internet. The rise of these gargantuan datasets has given rise to formidable bodies of critical work that has called for caution while generating these large datasets. These address concerns surrounding the dubious curation practices used to generate these datasets, the sord…

    Submitted 5 October, 2021; originally announced October 2021.

    Comments: 33 pages

  19. arXiv:2106.15590

    cs.LG cs.AI cs.CY

    The Values Encoded in Machine Learning Research

    Authors: Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, Michelle Bao

    Abstract: Machine learning currently exerts an outsized influence on the world, increasingly affecting institutional practices and impacted communities. It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we first introduce a method and annotation scheme for studying t…

    Submitted 21 June, 2022; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: Data and code available at https://github.com/wagnew3/The-Values-Encoded-in-Machine-Learning-Research. arXiv admin note: text overlap with arXiv:2206.04179

  20. Narratives and Counternarratives on Data Sharing in Africa

    Authors: Rediet Abebe, Kehinde Aruleba, Abeba Birhane, Sara Kingsley, George Obaido, Sekou L. Remy, Swathi Sadagopan

    Abstract: As machine learning and data science applications grow ever more prevalent, there is an increased focus on data sharing and open data initiatives, particularly in the context of the African continent. Many argue that data sharing can support research and policy design to alleviate poverty, inequality, and derivative effects in Africa. Despite the fact that the datasets in question are often extrac…

    Submitted 1 March, 2021; originally announced March 2021.

  21. arXiv:2009.14258

    cs.CY

    Towards decolonising computational sciences

    Authors: Abeba Birhane, Olivia Guest

    Abstract: This article sets out our perspective on how to begin the journey of decolonising computational fields, such as data and cognitive sciences. We see this struggle as requiring two basic steps: a) realisation that the present-day system has inherited, and still enacts, hostile, conservative, and oppressive behaviours and principles towards women of colour (WoC); and b) rejection of the idea that cen…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: A version of this work will appear in the Danish Journal of Women, Gender and Research (https://koensforskning.soc.ku.dk/english/kkof/) in December 2020

  22. arXiv:2006.16923

    cs.CY stat.AP stat.ML

    Large image datasets: A pyrrhic win for computer vision?

    Authors: Vinay Uday Prabhu, Abeba Birhane

    Abstract: In this paper we investigate problematic practices and consequences of large scale vision datasets. We examine broad issues such as the question of consent and justice as well as specific concerns such as the inclusion of verifiably pornographic images in datasets. Taking the ImageNet-ILSVRC-2012 dataset as an example, we perform a cross-sectional model-based quantitative census covering factors s…

    Submitted 23 July, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: GitHub: https://github.com/vinayprabhu/Dataset_audits. Update on July 23rd: (1) Added in the supplementary section (2) The curators of the Tiny Images dataset decided to withdraw the dataset in response to the previous version of this paper, a change that has duly been reflected in this version. Their statement: https://groups.csail.mit.edu/vision/TinyImages/

  23. Robot Rights? Let's Talk about Human Welfare Instead

    Authors: Abeba Birhane, Jelle van Dijk

    Abstract: The 'robot rights' debate, and its related question of 'robot responsibility', invokes some of the most polarized positions in AI ethics. While some advocate for granting robots rights on a par with human beings, others, in a stark opposition argue that robots are not deserving of rights but are objects that should be our slaves. Grounded in post-Cartesian philosophical foundations, we argue not j…

    Submitted 14 January, 2020; originally announced January 2020.

    Comments: Accepted to the AIES 2020 conference in New York, February 2020. The final version of this paper will appear in Proceedings of the 2020 AAAI/ACM Conference on AI, Ethics, and Society

  24. arXiv:1912.07376

    cs.CY

    Algorithmic Injustices: Towards a Relational Ethics

    Authors: Abeba Birhane, Fred Cummins

    Abstract: It has become trivial to point out how decision-making processes in various social, political and economic spheres are assisted by automated systems. Improved efficiency, the hallmark of these systems, drives the mass scale integration of automated systems into daily life. However, as a robust body of research in the area of algorithmic injustice shows, algorithmic tools embed and perpetuate soci…

    Submitted 16 December, 2019; originally announced December 2019.

    Comments: Presented at the Black in AI workshop, NeurIPS 2019