research-article

RetainVis: Visual Analytics with Interpretable and Interactive Recurrent Neural Networks on Electronic Medical Records

Published: 01 January 2019

Abstract

Recurrent neural networks (RNNs) have recently been applied successfully to electronic medical records (EMRs), which contain histories of patients' diagnoses, medications, and various other events, in order to predict the current and future states of patients. Despite the strong performance of RNNs, it is often challenging for users to understand why the model makes a particular prediction. This black-box nature of RNNs can impede their wide adoption in clinical practice. Furthermore, there are no established methods for interactively leveraging users' domain expertise and prior knowledge as inputs for steering the model. Therefore, our design study aims to provide a visual analytics solution that increases the interpretability and interactivity of RNNs through a joint effort of medical experts, artificial intelligence scientists, and visual analytics researchers. Following an iterative design process among these experts, we design, implement, and evaluate a visual analytics tool called RetainVis, which couples a newly improved, interpretable, and interactive RNN-based model called RetainEX with visualizations for users' exploration of EMR data in the context of prediction tasks. Our study shows the effective use of RetainVis for gaining insights into how individual medical codes contribute to risk predictions, using EMRs of patients with heart failure and cataract symptoms. Our study also demonstrates how we made substantial changes to the state-of-the-art RNN model called RETAIN in order to make use of temporal information and increase interactivity. This study will provide a useful guideline for researchers who aim to design an interpretable and interactive visual analytics tool for RNNs.

Published In

IEEE Transactions on Visualization and Computer Graphics, Volume 25, Issue 1
Jan. 2019
1266 pages

Publisher

IEEE Educational Activities Department, United States
