DOI: 10.1007/978-3-030-57321-8_5

Explainable Reinforcement Learning: A Survey

Published: 25 August 2020

Abstract

Explainable Artificial Intelligence (XAI), i.e., the development of more transparent and interpretable AI models, has gained increasing traction over the last few years. This is because, as AI models have grown into powerful and ubiquitous tools, they exhibit one detrimental characteristic: a performance-transparency trade-off. The more complex a model's inner workings, the less clear it is how its predictions or decisions are reached. Especially for Machine Learning (ML) methods such as Reinforcement Learning (RL), where the system learns autonomously, the need to understand the reasoning behind its decisions is apparent. Since, to the best of our knowledge, no single work offers an overview of Explainable Reinforcement Learning (XRL) methods, this survey attempts to address that gap. We give a short summary of the problem, define important terms, and offer a classification and assessment of current XRL methods. We found that (a) the majority of XRL methods function by mimicking and simplifying a complex model rather than designing an inherently simple one, and (b) XRL (and XAI) methods often neglect the human side of the equation, ignoring research from related fields such as psychology and philosophy. An interdisciplinary effort is therefore needed to adapt generated explanations to (non-expert) human users in order to make effective progress in XRL, and in XAI in general.

Published In

Machine Learning and Knowledge Extraction: 4th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2020, Dublin, Ireland, August 25–28, 2020, Proceedings
Aug 2020
535 pages
ISBN:978-3-030-57320-1
DOI:10.1007/978-3-030-57321-8
Editors: Andreas Holzinger, Peter Kieseberg, A Min Tjoa, Edgar Weippl

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Machine learning
  2. Explainable
  3. Reinforcement Learning
  4. Human-computer interaction
  5. Interpretable

Qualifiers

  • Article
