DOI: 10.1145/3313831.3376447

Silva: Interactively Assessing Machine Learning Fairness Using Causality

Published: 23 April 2020

Abstract

Machine learning models risk encoding unfairness from their developers or data sources. Assessing fairness is challenging, however, as analysts may misidentify sources of bias, fail to notice them altogether, or misapply metrics. In this paper we introduce Silva, a system for interactively exploring potential sources of unfairness in datasets and machine learning models. Silva directs user attention to relationships between attributes through a global causal view, provides interactive recommendations, presents intermediate results, and visualizes fairness metrics. We describe the implementation of Silva, identify salient design and technical challenges, and evaluate the tool against an existing fairness optimization tool.
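The abstract refers to fairness metrics without defining them. As an illustrative sketch (not Silva's actual implementation), two common group fairness metrics of the kind such tools visualize, the demographic parity difference and the equal-opportunity (true-positive-rate) gap, can be computed on a toy dataset as follows. All data values and function names here are hypothetical:

```python
# Illustrative only: two group fairness metrics computed on toy data.
# "group" is a binary protected attribute; data is invented for the example.

def demographic_parity_difference(y_pred, group):
    """Difference in positive-prediction rates between group 1 and group 0."""
    rate = {}
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, group) if a == g]
        rate[g] = sum(preds) / len(preds)
    return rate[1] - rate[0]

def true_positive_rate_gap(y_true, y_pred, group):
    """Equal-opportunity gap: TPR of group 1 minus TPR of group 0."""
    tpr = {}
    for g in (0, 1):
        pairs = [(t, p) for t, p, a in zip(y_true, y_pred, group) if a == g]
        positives = [p for t, p in pairs if t == 1]
        tpr[g] = sum(positives) / len(positives)
    return tpr[1] - tpr[0]

# Toy data: group 0 receives positive predictions more often than group 1.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

print(demographic_parity_difference(y_pred, group))      # → -0.5
print(true_positive_rate_gap(y_true, y_pred, group))     # → -1.0
```

A negative value in either metric indicates the model favors group 0; a causal view like Silva's would then help the analyst ask whether the disparity flows through a legitimate attribute or a proxy for the protected one.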

Supplemental Material

- MP4 File: preview video
- SBV File: preview video captions


    Published In

    CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
    April 2020
    10688 pages
ISBN: 9781450367080
DOI: 10.1145/3313831
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. bias
    2. interactive system
    3. machine learning fairness

    Qualifiers

    • Research-article

    Conference

    CHI '20

    Acceptance Rates

    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%


