
Why-oriented end-user debugging of naive Bayes text classification

Published: 31 October 2011

Abstract

Machine learning techniques are increasingly used in intelligent assistants, that is, software targeted at and continuously adapting to assist end users with email, shopping, and other tasks. Examples include desktop spam filters, recommender systems, and handwriting recognition. Fixing such intelligent assistants when they learn incorrect behavior, however, has received only limited attention. To directly support end-user “debugging” of assistant behaviors learned via statistical machine learning, we present a Why-oriented approach that allows users to ask questions about how the assistant made its predictions, provides answers to these “why” questions, and allows users to interactively change these answers to debug the assistant's current and future predictions. To understand the strengths and weaknesses of this approach, we conducted an exploratory study to investigate barriers that participants could encounter when debugging an intelligent assistant using our approach, and the information those participants requested to overcome these barriers. To help ensure the inclusiveness of our approach, we also explored how gender differences played a role in understanding barriers and information needs. We then used these results to consider opportunities for Why-oriented approaches to address user barriers and information needs.
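The abstract describes exposing the reasoning behind a naive Bayes prediction so users can inspect and adjust it. As an illustrative sketch only (the toy corpus, labels, and function names below are invented, and this is not the paper's implementation), a multinomial naive Bayes text classifier can surface per-word log-probability contributions, which is the raw material for the kind of "why" answer the approach provides:

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled corpus: short messages filed into folders.
train = [
    ("meeting agenda budget", "work"),
    ("budget report meeting", "work"),
    ("pizza party friday", "personal"),
    ("party photos friday", "personal"),
]

# Train a multinomial naive Bayes model with Laplace (add-one) smoothing.
class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)  # label -> word -> count
for text, label in train:
    word_counts[label].update(text.split())
vocab = {w for text, _ in train for w in text.split()}

def log_prob(word, label):
    """Smoothed log P(word | label)."""
    total = sum(word_counts[label].values())
    return math.log((word_counts[label][word] + 1) / (total + len(vocab)))

def explain(text):
    """For each class: (total log score, per-word contributions).

    The per-word contributions are what a Why-oriented interface
    could display and let the user adjust."""
    scores = {}
    for label in class_counts:
        contribs = [(w, log_prob(w, label)) for w in text.split() if w in vocab]
        prior = math.log(class_counts[label] / len(train))
        scores[label] = (prior + sum(c for _, c in contribs), contribs)
    return scores

scores = explain("budget meeting friday")
prediction = max(scores, key=lambda lbl: scores[lbl][0])
```

In this sketch, "budget" and "meeting" contribute more strongly to the "work" class, so the message is filed as work; a debugging interface built on these contributions would let a user see that and, for example, strengthen a word's association with a different folder to change future predictions.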



Published In

ACM Transactions on Interactive Intelligent Systems, Volume 1, Issue 1
October 2011, 150 pages
ISSN: 2160-6455
EISSN: 2160-6463
DOI: 10.1145/2030365
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 31 October 2011
    Accepted: 01 March 2011
    Revised: 01 February 2011
    Received: 01 March 2010
    Published in TIIS Volume 1, Issue 1


    Author Tags

1. debugging
2. end-user programming
3. machine learning

    Qualifiers

    • Research-article
    • Research
    • Refereed
