Search | arXiv e-print repository

Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency

Abstract: In this paper we argue that key, often sensational and misleading, claims regarding linguistic capabilities of Large Language Models (LLMs) are based on at least two unfounded assumptions; the assumption of language completeness and the assumption of data completeness. Language completeness assumes that a distinct and complete thing such as `a natural language' exists, the essential characteristic… ▽ More In this paper we argue that key, often sensational and misleading, claims regarding linguistic capabilities of Large Language Models (LLMs) are based on at least two unfounded assumptions; the assumption of language completeness and the assumption of data completeness. Language completeness assumes that a distinct and complete thing such as `a natural language' exists, the essential characteristics of which can be effectively and comprehensively modelled by an LLM. The assumption of data completeness relies on the belief that a language can be quantified and wholly captured by data. Work within the enactive approach to cognitive science makes clear that, rather than a distinct and complete thing, language is a means or way of acting. Languaging is not the kind of thing that can admit of a complete or comprehensive modelling. From an enactive perspective we identify three key characteristics of enacted language; embodiment, participation, and precariousness, that are absent in LLMs, and likely incompatible in principle with current architectures. We argue that these absences imply that LLMs are not now and cannot in their present form be linguistic agents the way humans are. We illustrate the point in particular through the phenomenon of `algospeak', a recently described pattern of high stakes human language activity in heavily controlled online environments. On the basis of these points, we conclude that sensational and misleading claims about LLM agency and capabilities emerge from a deep misconception of both what human language is and what LLMs are. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: To appear in the Journal of Language Sciences

arXiv:2405.04623 [pdf, other]

doi 10.1145/3630106.3658968

The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models

Authors: Abeba Birhane, Sepehr Dehdashtian, Vinay Uday Prabhu, Vishnu Boddeti

Abstract: Scale the model, scale the data, scale the GPU farms is the reigning sentiment in the world of generative AI today. While model scaling has been extensively studied, data scaling and its downstream impacts on model performance remain under-explored. This is particularly important in the context of multimodal datasets whose main source is the World Wide Web, condensed and packaged as the Common Cra… ▽ More Scale the model, scale the data, scale the GPU farms is the reigning sentiment in the world of generative AI today. While model scaling has been extensively studied, data scaling and its downstream impacts on model performance remain under-explored. This is particularly important in the context of multimodal datasets whose main source is the World Wide Web, condensed and packaged as the Common Crawl dump, which is known to exhibit numerous drawbacks. In this paper, we evaluate the downstream impact of dataset scaling on 14 visio-linguistic models (VLMs) trained on the LAION400-M and LAION-2B datasets by measuring racial and gender bias using the Chicago Face Dataset (CFD) as the probe. Our results show that as the training data increased, the probability of a pre-trained CLIP model misclassifying human images as offensive non-human classes such as chimpanzee, gorilla, and orangutan decreased, but misclassifying the same images as human offensive classes such as criminal increased. Furthermore, of the 14 Vision Transformer-based VLMs we evaluated, the probability of predicting an image of a Black man and a Latino man as criminal increases by 65% and 69%, respectively, when the dataset is scaled from 400M to 2B samples for the larger ViT-L models. Conversely, for the smaller base ViT-B models, the probability of predicting an image of a Black man and a Latino man as criminal decreases by 20% and 47%, respectively, when the dataset is scaled from 400M to 2B samples. We ground the model audit results in a qualitative and historical analysis, reflect on our findings and their implications for dataset curation practice, and close with a summary of mitigation mechanisms and ways forward. Content warning: This article contains racially dehumanising and offensive descriptions. △ Less

Submitted 7 May, 2024; originally announced May 2024.

Comments: To appear in the proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT 24), June 3 to 6, 2024, Rio de Janeiro, Brazil. arXiv admin note: text overlap with arXiv:2306.13141

arXiv:2404.10072 [pdf, other]

Debunking Robot Rights Metaphysically, Ethically, and Legally

Authors: Abeba Birhane, Jelle van Dijk, Frank Pasquale

Abstract: In this work we challenge arguments for robot rights on metaphysical, ethical and legal grounds. Metaphysically, we argue that machines are not the kinds of things that may be denied or granted rights. Building on theories of phenomenology and post-Cartesian approaches to cognitive science, we ground our position in the lived reality of actual humans in an increasingly ubiquitously connected, cont… ▽ More In this work we challenge arguments for robot rights on metaphysical, ethical and legal grounds. Metaphysically, we argue that machines are not the kinds of things that may be denied or granted rights. Building on theories of phenomenology and post-Cartesian approaches to cognitive science, we ground our position in the lived reality of actual humans in an increasingly ubiquitously connected, controlled, digitized, and surveilled society. Ethically, we argue that, given machines current and potential harms to the most marginalized in society, limits on (rather than rights for) machines should be at the centre of current AI ethics debate. From a legal perspective, the best analogy to robot rights is not human rights but corporate rights, a highly controversial concept whose most important effect has been the undermining of worker, consumer, and voter rights by advancing the power of capital to exercise outsized influence on politics and law. The idea of robot rights, we conclude, acts as a smoke screen, allowing theorists and futurists to fantasize about benevolently sentient machines with unalterable needs and desires protected by law. While such fantasies have motivated fascinating fiction and art, once they influence legal theory and practice articulating the scope of rights claims, they threaten to immunize from legal accountability the current AI and robotics that is fuelling surveillance capitalism, accelerating environmental destruction, and entrenching injustice and human suffering. △ Less

Submitted 15 April, 2024; originally announced April 2024.

Comments: Published in First Monday special issue entitled "Ideologies of AI and the consolidation of power"

arXiv:2402.17861 [pdf, other]

Towards AI Accountability Infrastructure: Gaps and Opportunities in AI Audit Tooling

Authors: Victor Ojewale, Ryan Steed, Briana Vecchione, Abeba Birhane, Inioluwa Deborah Raji

Abstract: Audits are critical mechanisms for identifying the risks and limitations of deployed artificial intelligence (AI) systems. However, the effective execution of AI audits remains incredibly difficult. As a result, practitioners make use of various tools to support their efforts. Drawing on interviews with 35 AI audit practitioners and a landscape analysis of 390 tools, we map the current ecosystem o… ▽ More Audits are critical mechanisms for identifying the risks and limitations of deployed artificial intelligence (AI) systems. However, the effective execution of AI audits remains incredibly difficult. As a result, practitioners make use of various tools to support their efforts. Drawing on interviews with 35 AI audit practitioners and a landscape analysis of 390 tools, we map the current ecosystem of available AI audit tools. While there are many tools designed to assist practitioners with setting standards and evaluating AI systems, these tools often fell short of supporting the accountability goals of AI auditing in practice. We thus highlight areas for future tool development beyond evaluation -- from harms discovery to advocacy -- and outline challenges practitioners faced in their efforts to use AI audit tools. We conclude that resources are lacking to adequately support the full scope of needs for many AI audit practitioners and recommend that the field move beyond tools for just evaluation, towards more comprehensive infrastructure for AI accountability. △ Less

Submitted 14 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2401.14462 [pdf, other]

AI auditing: The Broken Bus on the Road to AI Accountability

Authors: Abeba Birhane, Ryan Steed, Victor Ojewale, Briana Vecchione, Inioluwa Deborah Raji

Abstract: One of the most concrete measures to take towards meaningful AI accountability is to consequentially assess and report the systems' performance and impact. However, the practical nature of the "AI audit" ecosystem is muddled and imprecise, making it difficult to work through various concepts and map out the stakeholders involved in the practice. First, we taxonomize current AI audit practices as c… ▽ More One of the most concrete measures to take towards meaningful AI accountability is to consequentially assess and report the systems' performance and impact. However, the practical nature of the "AI audit" ecosystem is muddled and imprecise, making it difficult to work through various concepts and map out the stakeholders involved in the practice. First, we taxonomize current AI audit practices as completed by regulators, law firms, civil society, journalism, academia, consulting agencies. Next, we assess the impact of audits done by stakeholders within each domain. We find that only a subset of AI audit studies translate to desired accountability outcomes. We thus assess and isolate practices necessary for effective AI audit results, articulating the observed connections between AI audit design, methodology and institutional context on its effectiveness as a meaningful mechanism for accountability. △ Less

Submitted 25 January, 2024; originally announced January 2024.

Comments: To appear in the proceedings of the 2nd IEEE Conference on Secure and Trustworthy Machine Learning (SaTML) 2024

arXiv:2311.03449 [pdf, other]

Into the LAIONs Den: Investigating Hate in Multimodal Datasets

Authors: Abeba Birhane, Vinay Prabhu, Sang Han, Vishnu Naresh Boddeti, Alexandra Sasha Luccioni

Abstract: 'Scale the model, scale the data, scale the compute' is the reigning sentiment in the world of generative AI today. While the impact of model scaling has been extensively studied, we are only beginning to scratch the surface of data scaling and its consequences. This is especially of critical importance in the context of vision-language datasets such as LAION. These datasets are continually growin… ▽ More 'Scale the model, scale the data, scale the compute' is the reigning sentiment in the world of generative AI today. While the impact of model scaling has been extensively studied, we are only beginning to scratch the surface of data scaling and its consequences. This is especially of critical importance in the context of vision-language datasets such as LAION. These datasets are continually growing in size and are built based on large-scale internet dumps such as the Common Crawl, which is known to have numerous drawbacks ranging from quality, legality, and content. The datasets then serve as the backbone for large generative models, contributing to the operationalization and perpetuation of harmful societal and historical biases and stereotypes. In this paper, we investigate the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B. Our results show that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively using a metric that we term as Hate Content Rate (HCR). We also found that filtering dataset contents based on Not Safe For Work (NSFW) values calculated based on images alone does not exclude all the harmful content in alt-text. Instead, we found that trace amounts of hateful, targeted, and aggressive text remain even when carrying out conservative filtering. We end with a reflection and a discussion of the significance of our results for dataset curation and usage in the AI community. Code and the meta-data assets curated in this paper are publicly available at https://github.com/vinayprabhu/hate_scaling. Content warning: This paper contains examples of hateful text that might be disturbing, distressing, and/or offensive. △ Less

Submitted 6 November, 2023; originally announced November 2023.

Comments: To appear at 37th Conference on Neural Information Processing Systems (NeurIPS 2023) Datasets and Benchmarks Track. arXiv admin note: substantial text overlap with arXiv:2306.13141

arXiv:2309.15084 [pdf, other]

The Surveillance AI Pipeline

Authors: Pratyusha Ria Kalluri, William Agnew, Myra Cheng, Kentrell Owens, Luca Soldaini, Abeba Birhane

Abstract: A rapidly growing number of voices argue that AI research, and computer vision in particular, is powering mass surveillance. Yet the direct path from computer vision research to surveillance has remained obscured and difficult to assess. Here, we reveal the Surveillance AI pipeline by analyzing three decades of computer vision research papers and downstream patents, more than 40,000 documents. We… ▽ More A rapidly growing number of voices argue that AI research, and computer vision in particular, is powering mass surveillance. Yet the direct path from computer vision research to surveillance has remained obscured and difficult to assess. Here, we reveal the Surveillance AI pipeline by analyzing three decades of computer vision research papers and downstream patents, more than 40,000 documents. We find the large majority of annotated computer vision papers and patents self-report their technology enables extracting data about humans. Moreover, the majority of these technologies specifically enable extracting data about human bodies and body parts. We present both quantitative and rich qualitative analysis illuminating these practices of human data extraction. Studying the roots of this pipeline, we find that institutions that prolifically produce computer vision research, namely elite universities and "big tech" corporations, are subsequently cited in thousands of surveillance patents. Further, we find consistent evidence against the narrative that only these few rogue entities are contributing to surveillance. Rather, we expose the fieldwide norm that when an institution, nation, or subfield authors computer vision papers with downstream patents, the majority of these papers are used in surveillance patents. In total, we find the number of papers with downstream surveillance patents increased more than five-fold between the 1990s and the 2010s, with computer vision research now having been used in more than 11,000 surveillance patents. Finally, in addition to the high levels of surveillance we find documented in computer vision papers and patents, we unearth pervasive patterns of documents using language that obfuscates the extent of surveillance. Our analysis reveals the pipeline by which computer vision research has powered the ongoing expansion of surveillance. △ Less

Submitted 17 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

arXiv:2306.13141 [pdf, other]

On Hate Scaling Laws For Data-Swamps

Authors: Abeba Birhane, Vinay Prabhu, Sang Han, Vishnu Naresh Boddeti

Abstract: `Scale the model, scale the data, scale the GPU-farms' is the reigning sentiment in the world of generative AI today. While model scaling has been extensively studied, data scaling and its downstream impacts remain under explored. This is especially of critical importance in the context of visio-linguistic datasets whose main source is the World Wide Web, condensed and packaged as the CommonCrawl… ▽ More `Scale the model, scale the data, scale the GPU-farms' is the reigning sentiment in the world of generative AI today. While model scaling has been extensively studied, data scaling and its downstream impacts remain under explored. This is especially of critical importance in the context of visio-linguistic datasets whose main source is the World Wide Web, condensed and packaged as the CommonCrawl dump. This large scale data-dump, which is known to have numerous drawbacks, is repeatedly mined and serves as the data-motherlode for large generative models. In this paper, we: 1) investigate the effect of scaling datasets on hateful content through a comparative audit of the LAION-400M and LAION-2B-en, containing 400 million and 2 billion samples respectively, and 2) evaluate the downstream impact of scale on visio-linguistic models trained on these dataset variants by measuring racial bias of the models trained on them using the Chicago Face Dataset (CFD) as a probe. Our results show that 1) the presence of hateful content in datasets, when measured with a Hate Content Rate (HCR) metric on the inferences of the Pysentimiento hate-detection Natural Language Processing (NLP) model, increased by nearly $12\%$ and 2) societal biases and negative stereotypes were also exacerbated with scale on the models we evaluated. As scale increased, the tendency of the model to associate images of human faces with the `human being' class over 7 other offensive classes reduced by half. Furthermore, for the Black female category, the tendency of the model to associate their faces with the `criminal' class doubled, while quintupling for Black male faces. We present a qualitative and historical analysis of the model audit results, reflect on our findings and its implications for dataset curation practice, and close with a summary of our findings and potential future work to be done in this area. △ Less

Submitted 28 June, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

arXiv:2301.08559 [pdf, other]

The Lost Art of Mathematical Modelling

Authors: Linnéa Gyllingberg, Abeba Birhane, David J. T. Sumpter

Abstract: We provide a critique of mathematical biology in light of rapid developments in modern machine learning. We argue that out of the three modelling activities -- (1) formulating models; (2) analysing models; and (3) fitting or comparing models to data -- inherent to mathematical biology, researchers currently focus too much on activity (2) at the cost of (1). This trend, we propose, can be reversed… ▽ More We provide a critique of mathematical biology in light of rapid developments in modern machine learning. We argue that out of the three modelling activities -- (1) formulating models; (2) analysing models; and (3) fitting or comparing models to data -- inherent to mathematical biology, researchers currently focus too much on activity (2) at the cost of (1). This trend, we propose, can be reversed by realising that any given biological phenomena can be modelled in an infinite number of different ways, through the adoption of an open/pluralistic approach. We explain the open approach using fish locomotion as a case study and illustrate some of the pitfalls -- universalism, creating models of models, etc. -- that hinder mathematical biology. We then ask how we might rediscover a lost art: that of creative mathematical modelling. This article is dedicated to the memory of Edmund Crampin. △ Less

Submitted 2 June, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

MSC Class: 92B05

arXiv:2209.07572 [pdf, ps, other]

doi 10.1145/3551624.3555290

Power to the People? Opportunities and Challenges for Participatory AI

Authors: Abeba Birhane, William Isaac, Vinodkumar Prabhakaran, Mark Díaz, Madeleine Clare Elish, Iason Gabriel, Shakir Mohamed

Abstract: Participatory approaches to artificial intelligence (AI) and machine learning (ML) are gaining momentum: the increased attention comes partly with the view that participation opens the gateway to an inclusive, equitable, robust, responsible and trustworthy AI.Among other benefits, participatory approaches are essential to understanding and adequately representing the needs, desires and perspective… ▽ More Participatory approaches to artificial intelligence (AI) and machine learning (ML) are gaining momentum: the increased attention comes partly with the view that participation opens the gateway to an inclusive, equitable, robust, responsible and trustworthy AI.Among other benefits, participatory approaches are essential to understanding and adequately representing the needs, desires and perspectives of historically marginalized communities. However, there currently exists lack of clarity on what meaningful participation entails and what it is expected to do. In this paper we first review participatory approaches as situated in historical contexts as well as participatory methods and practices within the AI and ML pipeline. We then introduce three case studies in participatory AI.Participation holds the potential for beneficial, emancipatory and empowering technology design, development and deployment while also being at risk for concerns such as cooptation and conflation with other activities. We lay out these limitations and concerns and argue that as participatory AI/ML becomes in vogue, a contextual and nuanced understanding of the term as well as consideration of who the primary beneficiaries of participatory activities ought to be constitute crucial factors to realizing the benefits and opportunities that participation brings. △ Less

Submitted 15 September, 2022; originally announced September 2022.

Comments: To appear in the proceeding of EAAMO 2022

arXiv:2206.04179 [pdf, other]

Automating Ambiguity: Challenges and Pitfalls of Artificial Intelligence

Authors: Abeba Birhane

Abstract: Machine learning (ML) and artificial intelligence (AI) tools increasingly permeate every possible social, political, and economic sphere; sorting, taxonomizing and predicting complex human behaviour and social phenomena. However, from fallacious and naive groundings regarding complex adaptive systems to datasets underlying models, these systems are beset by problems, challenges, and limitations. T… ▽ More Machine learning (ML) and artificial intelligence (AI) tools increasingly permeate every possible social, political, and economic sphere; sorting, taxonomizing and predicting complex human behaviour and social phenomena. However, from fallacious and naive groundings regarding complex adaptive systems to datasets underlying models, these systems are beset by problems, challenges, and limitations. They remain opaque and unreliable, and fail to consider societal and structural oppressive systems, disproportionately negatively impacting those at the margins of society while benefiting the most powerful. The various challenges, problems and pitfalls of these systems are a hot topic of research in various areas, such as critical data/algorithm studies, science and technology studies (STS), embodied and enactive cognitive science, complexity science, Afro-feminism, and the broadly construed emerging field of Fairness, Accountability, and Transparency (FAccT). Yet, these fields of enquiry often proceed in silos. This thesis weaves together seemingly disparate fields of enquiry to examine core scientific and ethical challenges, pitfalls, and problems of AI. In this thesis I, a) review the historical and cultural ecology from which AI research emerges, b) examine the shaky scientific grounds of machine prediction of complex behaviour illustrating how predicting complex behaviour with precision is impossible in principle, c) audit large scale datasets behind current AI demonstrating how they embed societal historical and structural injustices, d) study the seemingly neutral values of ML research and put forward 67 prominent values underlying ML research, e) examine some of the insidious and worrying applications of computer vision research, and f) put forward a framework for approaching challenges, failures and problems surrounding ML systems as well as alternative ways forward. △ Less

Submitted 8 June, 2022; originally announced June 2022.

Comments: PhD thesis

arXiv:2205.08922 [pdf, other]

The games we play: critical complexity improves machine learning

Authors: Abeba Birhane, David J. T. Sumpter

Abstract: When mathematical modelling is applied to capture a complex system, multiple models are often created that characterize different aspects of that system. Often, a model at one level will produce a prediction which is contradictory at another level but both models are accepted because they are both useful. Rather than aiming to build a single unified model of a complex system, the modeller acknowle… ▽ More When mathematical modelling is applied to capture a complex system, multiple models are often created that characterize different aspects of that system. Often, a model at one level will produce a prediction which is contradictory at another level but both models are accepted because they are both useful. Rather than aiming to build a single unified model of a complex system, the modeller acknowledges the infinity of ways of capturing the system of interest, while offering their own specific insight. We refer to this pragmatic applied approach to complex systems -- one which acknowledges that they are incompressible, dynamic, nonlinear, historical, contextual, and value-laden -- as Open Machine Learning (Open ML). In this paper we define Open ML and contrast it with some of the grand narratives of ML of two forms: 1) Closed ML, ML which emphasizes learning with minimal human input (e.g. Google's AlphaZero) and 2) Partially Open ML, ML which is used to parameterize existing models. To achieve this, we use theories of critical complexity to both evaluate these grand narratives and contrast them with the Open ML approach. Specifically, we deconstruct grand ML `theories' by identifying thirteen 'games' played in the ML community. These games lend false legitimacy to models, contribute to over-promise and hype about the capabilities of artificial intelligence, reduce wider participation in the subject, lead to models that exacerbate inequality and cause discrimination and ultimately stifle creativity in research. We argue that best practice in ML should be more consistent with critical complexity perspectives than with rationalist, grand narratives. △ Less

Submitted 18 May, 2022; originally announced May 2022.

Comments: To appear in the HHAI 2022 conference proceedings

arXiv:2205.04221 [pdf, other]

doi 10.1145/3531146.3533157

The Forgotten Margins of AI Ethics

Authors: Abeba Birhane, Elayne Ruane, Thomas Laurent, Matthew S. Brown, Johnathan Flowers, Anthony Ventresque, Christopher L. Dancy

Abstract: How has recent AI Ethics literature addressed topics such as fairness and justice in the context of continued social and structural power asymmetries? We trace both the historical roots and current landmark work that have been shaping the field and categorize these works under three broad umbrellas: (i) those grounded in Western canonical philosophy, (ii) mathematical and statistical methods, and… ▽ More How has recent AI Ethics literature addressed topics such as fairness and justice in the context of continued social and structural power asymmetries? We trace both the historical roots and current landmark work that have been shaping the field and categorize these works under three broad umbrellas: (i) those grounded in Western canonical philosophy, (ii) mathematical and statistical methods, and (iii) those emerging from critical data/algorithm/information studies. We also survey the field and explore emerging trends by examining the rapidly growing body of literature that falls under the broad umbrella of AI Ethics. To that end, we read and annotated peer-reviewed papers published over the past four years in two premier conferences: FAccT and AIES. We organize the literature based on an annotation scheme we developed according to three main dimensions: whether the paper deals with concrete applications, use-cases, and/or people's lived experience; to what extent it addresses harmed, threatened, or otherwise marginalized groups; and if so, whether it explicitly names such groups. We note that although the goals of the majority of FAccT and AIES papers were often commendable, their consideration of the negative impacts of AI on traditionally marginalized groups remained shallow. Taken together, our conceptual analysis and the data from annotated papers indicate that the field would benefit from an increased focus on ethical analysis grounded in concrete use-cases, people's experiences, and applications as well as from approaches that are sensitive to structural and historical power asymmetries. △ Less

Submitted 9 May, 2022; originally announced May 2022.

Comments: To appear in the FAccT 2022 proceedings

arXiv:2204.14256 [pdf, other]

Handling and Presenting Harmful Text in NLP Research

Authors: Hannah Rose Kirk, Abeba Birhane, Bertie Vidgen, Leon Derczynski

Abstract: Text data can pose a risk of harm. However, the risks are not fully understood, and how to handle, present, and discuss harmful text in a safe way remains an unresolved issue in the NLP community. We provide an analytical framework categorising harms on three axes: (1) the harm type (e.g., misinformation, hate speech or racial stereotypes); (2) whether a harm is \textit{sought} as a feature of the… ▽ More Text data can pose a risk of harm. However, the risks are not fully understood, and how to handle, present, and discuss harmful text in a safe way remains an unresolved issue in the NLP community. We provide an analytical framework categorising harms on three axes: (1) the harm type (e.g., misinformation, hate speech or racial stereotypes); (2) whether a harm is \textit{sought} as a feature of the research design if explicitly studying harmful content (e.g., training a hate speech classifier), versus \textit{unsought} if harmful content is encountered when working on unrelated problems (e.g., language generation or part-of-speech tagging); and (3) who it affects, from people (mis)represented in the data to those handling the data and those publishing on the data. We provide advice for practitioners, with concrete steps for mitigating harm in research and in publication. To assist implementation we introduce \textsc{HarmCheck} -- a documentation standard for handling and presenting harmful text in research. △ Less

Submitted 24 February, 2023; v1 submitted 29 April, 2022; originally announced April 2022.

Comments: in Findings of EMNLP 2022

arXiv:2204.03100 [pdf]

doi 10.5281/zenodo.6408326

Data Justice Stories: A Repository of Case Studies

Authors: David Leslie, Morgan Briggs, Antonella Perini, Smera Jayadeva, Cami Rincón, Noopur Raval, Abeba Birhane, Rosamund Powell, Michael Katell, Mhairi Aitken

Abstract: The idea of "data justice" is of recent academic vintage. It has arisen over the past decade in Anglo-European research institutions as an attempt to bring together a critique of the power dynamics that underlie accelerating trends of datafication with a normative commitment to the principles of social justice-a commitment to the achievement of a society that is equitable, fair, and capable of con… ▽ More The idea of "data justice" is of recent academic vintage. It has arisen over the past decade in Anglo-European research institutions as an attempt to bring together a critique of the power dynamics that underlie accelerating trends of datafication with a normative commitment to the principles of social justice-a commitment to the achievement of a society that is equitable, fair, and capable of confronting the root causes of injustice.However, despite the seeming novelty of such a data justice pedigree, this joining up of the critique of the power imbalances that have shaped the digital and "big data" revolutions with a commitment to social equity and constructive societal transformation has a deeper historical, and more geographically diverse, provenance. As the stories of the data justice initiatives, activism, and advocacy contained in this volume well evidence, practices of data justice across the globe have, in fact, largely preceded the elaboration and crystallisation of the idea of data justice in contemporary academic discourse. In telling these data justice stories, we hope to provide the reader with two interdependent tools of data justice thinking: First, we aim to provide the reader with the critical leverage needed to discern those distortions and malformations of data justice that manifest in subtle and explicit forms of power, domination, and coercion. Second, we aim to provide the reader with access to the historically effective forms of normativity and ethical insight that have been marshalled by data justice activists and advocates as tools of societal transformation-so that these forms of normativity and insight can be drawn on, in turn, as constructive resources to spur future transformative data justice practices. △ Less

Submitted 6 April, 2022; originally announced April 2022.

arXiv:2204.03090 [pdf]

doi 10.5281/zenodo.6408304

Advancing Data Justice Research and Practice: An Integrated Literature Review

Authors: David Leslie, Michael Katell, Mhairi Aitken, Jatinder Singh, Morgan Briggs, Rosamund Powell, Cami Rincón, Thompson Chengeta, Abeba Birhane, Antonella Perini, Smera Jayadeva, Anjali Mazumder

Abstract: The Advancing Data Justice Research and Practice (ADJRP) project aims to widen the lens of current thinking around data justice and to provide actionable resources that will help policymakers, practitioners, and impacted communities gain a broader understanding of what equitable, freedom-promoting, and rights-sustaining data collection, governance, and use should look like in increasingly dynamic… ▽ More The Advancing Data Justice Research and Practice (ADJRP) project aims to widen the lens of current thinking around data justice and to provide actionable resources that will help policymakers, practitioners, and impacted communities gain a broader understanding of what equitable, freedom-promoting, and rights-sustaining data collection, governance, and use should look like in increasingly dynamic and global data innovation ecosystems. In this integrated literature review we hope to lay the conceptual groundwork needed to support this aspiration. The introduction motivates the broadening of data justice that is undertaken by the literature review which follows. First, we address how certain limitations of the current study of data justice drive the need for a re-location of data justice research and practice. We map out the strengths and shortcomings of the contemporary state of the art and then elaborate on the challenges faced by our own effort to broaden the data justice perspective in the decolonial context. The body of the literature review covers seven thematic areas. For each theme, the ADJRP team has systematically collected and analysed key texts in order to tell the critical empirical story of how existing social structures and power dynamics present challenges to data justice and related justice fields. In each case, this critical empirical story is also supplemented by the transformational story of how activists, policymakers, and academics are challenging longstanding structures of inequity to advance social justice in data innovation ecosystems and adjacent areas of technological practice. △ Less

Submitted 6 April, 2022; originally announced April 2022.

arXiv:2112.04359 [pdf, other]

Ethical and social risks of harm from Language Models

Authors: Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

Abstract: This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguist… ▽ More This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguistics, and social sciences. We outline six specific risk areas: I. Discrimination, Exclusion and Toxicity, II. Information Hazards, III. Misinformation Harms, V. Malicious Uses, V. Human-Computer Interaction Harms, VI. Automation, Access, and Environmental Harms. The first area concerns the perpetuation of stereotypes, unfair discrimination, exclusionary norms, toxic language, and lower performance by social group for LMs. The second focuses on risks from private data leaks or LMs correctly inferring sensitive information. The third addresses risks arising from poor, false or misleading information including in sensitive domains, and knock-on risks such as the erosion of trust in shared information. The fourth considers risks from actors who try to use LMs to cause harm. The fifth focuses on risks specific to LLMs used to underpin conversational agents that interact with human users, including unsafe use, manipulation or deception. The sixth discusses the risk of environmental harm, job automation, and other challenges that may have a disparate effect on different social groups or communities. In total, we review 21 risks in-depth. We discuss the points of origin of different risks and point to potential mitigation approaches. Lastly, we discuss organisational responsibilities in implementing mitigations, and the role of collaboration and participation. We highlight directions for further research, particularly on expanding the toolkit for assessing and evaluating the outlined risks in LMs. △ Less

Submitted 8 December, 2021; originally announced December 2021.

arXiv:2110.01963 [pdf, other]

Multimodal datasets: misogyny, pornography, and malignant stereotypes

Authors: Abeba Birhane, Vinay Uday Prabhu, Emmanuel Kahembwe

Abstract: We have now entered the era of trillion parameter machine learning models trained on billion-sized datasets scraped from the internet. The rise of these gargantuan datasets has given rise to formidable bodies of critical work that has called for caution while generating these large datasets. These address concerns surrounding the dubious curation practices used to generate these datasets, the sord… ▽ More We have now entered the era of trillion parameter machine learning models trained on billion-sized datasets scraped from the internet. The rise of these gargantuan datasets has given rise to formidable bodies of critical work that has called for caution while generating these large datasets. These address concerns surrounding the dubious curation practices used to generate these datasets, the sordid quality of alt-text data available on the world wide web, the problematic content of the CommonCrawl dataset often used as a source for training large language models, and the entrenched biases in large-scale visio-linguistic models (such as OpenAI's CLIP model) trained on opaque datasets (WebImageText). In the backdrop of these specific calls of caution, we examine the recently released LAION-400M dataset, which is a CLIP-filtered dataset of Image-Alt-text pairs parsed from the Common-Crawl dataset. We found that the dataset contains, troublesome and explicit images and text pairs of rape, pornography, malign stereotypes, racist and ethnic slurs, and other extremely problematic content. We outline numerous implications, concerns and downstream harms regarding the current state of large scale datasets while raising open questions for various stakeholders including the AI community, regulators, policy makers and data subjects. △ Less

Submitted 5 October, 2021; originally announced October 2021.

Comments: 33 pages

arXiv:2106.15590 [pdf, other]

The Values Encoded in Machine Learning Research

Authors: Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, Michelle Bao

Abstract: Machine learning currently exerts an outsized influence on the world, increasingly affecting institutional practices and impacted communities. It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we first introduce a method and annotation scheme for studying t… ▽ More Machine learning currently exerts an outsized influence on the world, increasingly affecting institutional practices and impacted communities. It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we first introduce a method and annotation scheme for studying the values encoded in documents such as research papers. Applying the scheme, we analyze 100 highly cited machine learning papers published at premier machine learning conferences, ICML and NeurIPS. We annotate key features of papers which reveal their values: their justification for their choice of project, which attributes of their project they uplift, their consideration of potential negative consequences, and their institutional affiliations and funding sources. We find that few of the papers justify how their project connects to a societal need (15\%) and far fewer discuss negative potential (1\%). Through line-by-line content analysis, we identify 59 values that are uplifted in ML research, and, of these, we find that the papers most frequently justify and assess themselves based on Performance, Generalization, Quantitative evidence, Efficiency, Building on past work, and Novelty. We present extensive textual evidence and identify key themes in the definitions and operationalization of these values. Notably, we find systematic textual evidence that these top values are being defined and applied with assumptions and implications generally supporting the centralization of power.Finally, we find increasingly close ties between these highly cited papers and tech companies and elite universities. △ Less

Submitted 21 June, 2022; v1 submitted 29 June, 2021; originally announced June 2021.

Comments: Data and code available at https://github.com/wagnew3/The-Values-Encoded-in-Machine-Learning-Research. arXiv admin note: text overlap with arXiv:2206.04179

arXiv:2103.01168 [pdf, other]

doi 10.1145/3442188.3445897

Narratives and Counternarratives on Data Sharing in Africa

Authors: Rediet Abebe, Kehinde Aruleba, Abeba Birhane, Sara Kingsley, George Obaido, Sekou L. Remy, Swathi Sadagopan

Abstract: As machine learning and data science applications grow ever more prevalent, there is an increased focus on data sharing and open data initiatives, particularly in the context of the African continent. Many argue that data sharing can support research and policy design to alleviate poverty, inequality, and derivative effects in Africa. Despite the fact that the datasets in question are often extrac… ▽ More As machine learning and data science applications grow ever more prevalent, there is an increased focus on data sharing and open data initiatives, particularly in the context of the African continent. Many argue that data sharing can support research and policy design to alleviate poverty, inequality, and derivative effects in Africa. Despite the fact that the datasets in question are often extracted from African communities, conversations around the challenges of accessing and sharing African data are too often driven by nonAfrican stakeholders. These perspectives frequently employ a deficit narratives, often focusing on lack of education, training, and technological resources in the continent as the leading causes of friction in the data ecosystem. We argue that these narratives obfuscate and distort the full complexity of the African data sharing landscape. In particular, we use storytelling via fictional personas built from a series of interviews with African data experts to complicate dominant narratives and to provide counternarratives. Coupling these personas with research on data practices within the continent, we identify recurring barriers to data sharing as well as inequities in the distribution of data sharing benefits. In particular, we discuss issues arising from power imbalances resulting from the legacies of colonialism, ethno-centrism, and slavery, disinvestment in building trust, lack of acknowledgement of historical and present-day extractive practices, and Western-centric policies that are ill-suited to the African context. After outlining these problems, we discuss avenues for addressing them when sharing data generated in the continent. △ Less

Submitted 1 March, 2021; originally announced March 2021.

arXiv:2009.14258 [pdf, other]

Towards decolonising computational sciences

Authors: Abeba Birhane, Olivia Guest

Abstract: This article sets out our perspective on how to begin the journey of decolonising computational fields, such as data and cognitive sciences. We see this struggle as requiring two basic steps: a) realisation that the present-day system has inherited, and still enacts, hostile, conservative, and oppressive behaviours and principles towards women of colour (WoC); and b) rejection of the idea that cen… ▽ More This article sets out our perspective on how to begin the journey of decolonising computational fields, such as data and cognitive sciences. We see this struggle as requiring two basic steps: a) realisation that the present-day system has inherited, and still enacts, hostile, conservative, and oppressive behaviours and principles towards women of colour (WoC); and b) rejection of the idea that centering individual people is a solution to system-level problems. The longer we ignore these two steps, the more "our" academic system maintains its toxic structure, excludes, and harms Black women and other minoritised groups. This also keeps the door open to discredited pseudoscience, like eugenics and physiognomy. We propose that grappling with our fields' histories and heritage holds the key to avoiding mistakes of the past. For example, initiatives such as "diversity boards" can still be harmful because they superficially appear reformatory but nonetheless center whiteness and maintain the status quo. Building on the shoulders of many WoC's work, who have been paving the way, we hope to advance the dialogue required to build both a grass-roots and a top-down re-imagining of computational sciences -- including but not limited to psychology, neuroscience, cognitive science, computer science, data science, statistics, machine learning, and artificial intelligence. We aspire for these fields to progress away from their stagnant, sexist, and racist shared past into carving and maintaining an ecosystem where both a diverse demographics of researchers and scientific ideas that critically challenge the status quo are welcomed. △ Less

Submitted 29 September, 2020; originally announced September 2020.

Comments: A version of this work will appear in the Danish Journal of Women, Gender and Research (https://koensforskning.soc.ku.dk/english/kkof/) in December 2020

arXiv:2006.16923 [pdf, other]

Large image datasets: A pyrrhic win for computer vision?

Authors: Vinay Uday Prabhu, Abeba Birhane

Abstract: In this paper we investigate problematic practices and consequences of large scale vision datasets. We examine broad issues such as the question of consent and justice as well as specific concerns such as the inclusion of verifiably pornographic images in datasets. Taking the ImageNet-ILSVRC-2012 dataset as an example, we perform a cross-sectional model-based quantitative census covering factors s… ▽ More In this paper we investigate problematic practices and consequences of large scale vision datasets. We examine broad issues such as the question of consent and justice as well as specific concerns such as the inclusion of verifiably pornographic images in datasets. Taking the ImageNet-ILSVRC-2012 dataset as an example, we perform a cross-sectional model-based quantitative census covering factors such as age, gender, NSFW content scoring, class-wise accuracy, human-cardinality-analysis, and the semanticity of the image class information in order to statistically investigate the extent and subtleties of ethical transgressions. We then use the census to help hand-curate a look-up-table of images in the ImageNet-ILSVRC-2012 dataset that fall into the categories of verifiably pornographic: shot in a non-consensual setting (up-skirt), beach voyeuristic, and exposed private parts. We survey the landscape of harm and threats both society broadly and individuals face due to uncritical and ill-considered dataset curation practices. We then propose possible courses of correction and critique the pros and cons of these. We have duly open-sourced all of the code and the census meta-datasets generated in this endeavor for the computer vision community to build on. By unveiling the severity of the threats, our hope is to motivate the constitution of mandatory Institutional Review Boards (IRB) for large scale dataset curation processes. △ Less

Submitted 23 July, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

Comments: Github: https://github.com/vinayprabhu/Dataset_audits. Update on July 23rd: (1) Added in the supplementary section (2) The curators of the Tiny Images dataset decided to withdraw the dataset in response to the previous version of this paper, a change that has duly been reflected in this version. Their statement: https://groups.csail.mit.edu/vision/TinyImages/

arXiv:2001.05046 [pdf, ps, other]

doi 10.1145/3375627.3375855

Robot Rights? Let's Talk about Human Welfare Instead

Authors: Abeba Birhane, Jelle van Dijk

Abstract: The 'robot rights' debate, and its related question of 'robot responsibility', invokes some of the most polarized positions in AI ethics. While some advocate for granting robots rights on a par with human beings, others, in a stark opposition argue that robots are not deserving of rights but are objects that should be our slaves. Grounded in post-Cartesian philosophical foundations, we argue not j… ▽ More The 'robot rights' debate, and its related question of 'robot responsibility', invokes some of the most polarized positions in AI ethics. While some advocate for granting robots rights on a par with human beings, others, in a stark opposition argue that robots are not deserving of rights but are objects that should be our slaves. Grounded in post-Cartesian philosophical foundations, we argue not just to deny robots 'rights', but to deny that robots, as artifacts emerging out of and mediating human being, are the kinds of things that could be granted rights in the first place. Once we see robots as mediators of human being, we can understand how the `robots rights' debate is focused on first world problems, at the expense of urgent ethical concerns, such as machine bias, machine elicited human labour exploitation, and erosion of privacy all impacting society's least privileged individuals. We conclude that, if human being is our starting point and human welfare is the primary concern, the negative impacts emerging from machinic systems, as well as the lack of taking responsibility by people designing, selling and deploying such machines, remains the most pressing ethical discussion in AI. △ Less

Submitted 14 January, 2020; originally announced January 2020.

Comments: Accepted to the AIES 2020 conference in New York, February 2020. The final version of this paper will appear in Proceedings of the 2020 AAAI/ACM Conference on AI, Ethics, and Society

arXiv:1912.07376 [pdf, ps, other]

Algorithmic Injustices: Towards a Relational Ethics

Authors: Abeba Birhane, Fred Cummins

Abstract: It has become trivial to point out how decision-making processes in various social, political and economical sphere are assisted by automated systems. Improved efficiency, the hallmark of these systems, drives the mass scale integration of automated systems into daily life. However, as a robust body of research in the area of algorithmic injustice shows, algorithmic tools embed and perpetuate soci… ▽ More It has become trivial to point out how decision-making processes in various social, political and economical sphere are assisted by automated systems. Improved efficiency, the hallmark of these systems, drives the mass scale integration of automated systems into daily life. However, as a robust body of research in the area of algorithmic injustice shows, algorithmic tools embed and perpetuate societal and historical biases and injustice. In particular, a persistent recurring trend within the literature indicates that society's most vulnerable are disproportionally impacted. When algorithmic injustice and bias is brought to the fore, most of the solutions on offer 1) revolve around technical solutions and 2) do not focus centre disproportionally impacted groups. This paper zooms out and draws the bigger picture. It 1) argues that concerns surrounding algorithmic decision making and algorithmic injustice require fundamental rethinking above and beyond technical solutions, and 2) outlines a way forward in a manner that centres vulnerable groups through the lens of relational ethics. △ Less

Submitted 16 December, 2019; originally announced December 2019.

Comments: Presented at the Black in AI workshop, @NeurIPS2019

Showing 1–24 of 24 results for author: Birhane, A