DOI: 10.1145/3404835.3462850
research-article

When Fair Ranking Meets Uncertain Inference

Published: 11 July 2021

Abstract

Existing fair ranking systems, especially those designed to be demographically fair, assume that accurate demographic information about individuals is available to the ranking algorithm. In practice, however, this assumption may not hold: in real-world contexts like ranking job applicants or credit seekers, social and legal barriers may prevent algorithm operators from collecting people's demographic information. In these cases, algorithm operators may attempt to infer people's demographics and then supply these inferences as inputs to the ranking algorithm.
In this study, we investigate how uncertainty and errors in demographic inference impact the fairness offered by fair ranking algorithms. Using simulations and three case studies with real datasets, we show how demographic inferences drawn from real systems can lead to unfair rankings. Our results suggest that developers should not use inferred demographic data as input to fair ranking algorithms, unless the inferences are extremely accurate.
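The effect described above can be illustrated with a toy simulation. The sketch below is not the authors' experimental setup: the greedy re-ranker `fair_rerank`, the synthetic score model, the 50% target share, and the helper `simulate` are all assumptions introduced here for illustration. It re-ranks candidates so that every prefix contains a minimum share of items labelled as protected, then measures how many items in the top of the list truly belong to the protected group when the labels supplied to the ranker are flipped at a given error rate.

```python
import random


def fair_rerank(scores, labels, target_share):
    """Hypothetical greedy re-ranker (not a published algorithm).

    Builds a ranking in which every prefix of length k contains at least
    floor(target_share * k) items whose supplied label is 1 (protected),
    as long as such items remain available.
    """
    protected = sorted((i for i, g in enumerate(labels) if g == 1),
                       key=lambda i: -scores[i])
    others = sorted((i for i, g in enumerate(labels) if g == 0),
                    key=lambda i: -scores[i])
    ranking, n_prot, p, o = [], 0, 0, 0
    while p < len(protected) or o < len(others):
        need = int(target_share * (len(ranking) + 1))
        if o >= len(others):
            take_protected = True
        else:
            # Take a protected item if the quota requires it or it scores higher.
            take_protected = p < len(protected) and (
                n_prot < need or scores[protected[p]] >= scores[others[o]]
            )
        if take_protected:
            ranking.append(protected[p])
            p += 1
            n_prot += 1
        else:
            ranking.append(others[o])
            o += 1
    return ranking


def simulate(error_rate, n=1000, top_k=100, seed=0):
    """Return the TRUE protected share in the top_k of a re-ranked list
    when the ranker only sees labels flipped with probability error_rate."""
    rng = random.Random(seed)
    true_labels = [1 if i % 2 == 0 else 0 for i in range(n)]  # half protected
    # Assumed score model: protected candidates are scored lower on average.
    scores = [rng.gauss(0.0, 1.0) - (1.0 if g == 1 else 0.0) for g in true_labels]
    # Simulated inference errors: each label is flipped independently.
    inferred = [1 - g if rng.random() < error_rate else g for g in true_labels]
    top = fair_rerank(scores, inferred, target_share=0.5)[:top_k]
    return sum(true_labels[i] for i in top) / top_k


if __name__ == "__main__":
    for rate in (0.0, 0.1, 0.3):
        print(rate, simulate(rate))
```

With perfect labels the quota guarantees at least the target share in the top of the list. With noisy labels the quota is still met on paper, but many of the slots go to misclassified, higher-scoring members of the other group, so the true share falls.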

Supplementary Material

MOV File (SIGIR_Rec.mov)
Presentation video - 15 min version as presented at SIGIR 2021

Published In

SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2021
2998 pages
ISBN:9781450380379
DOI:10.1145/3404835
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. algorithmic fairness
  2. demographic inference
  3. ethical AI
  4. noisy protected attributes
  5. ranking algorithms
  6. uncertainty

Qualifiers

  • Research-article

Funding Sources

  • Sloan Foundation, Sloan Fellowship 2019

Conference

SIGIR '21
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
