ICD-9 Tagging of Clinical Notes Using Topical Word Embedding

Authors:

Mary Jane C. Samonte,

Bobby D. Gerardo,

Arnel C. Fajardo,

Ruji P. Medina

ICIEB '18: Proceedings of the 2018 1st International Conference on Internet and e-Business

Pages 118 - 123

https://doi.org/10.1145/3230348.3230357

Published: 25 April 2018

Abstract

Medical records, which contain text, are increasing dramatically every day. This creates a greater need to analyze health information more effectively, which can be done through document classification in natural language applications. In this study, we describe the tagging of patient notes with ICD-9 codes through topical word embedding in a deep learning model called EnHANs. We formulate the task as a multi-label, multi-class classification problem to categorize the ICD-9 codes of a dataset of 400,000 critical care unit medical records. An accurate diagnosis expressed in ICD-9 codes is vital information for billing and insurance claims. We demonstrate that, through the use of a topical word embedding model, we learn to classify patient notes with their corresponding ICD-9 labels moderately well compared with single-label classification.
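The multi-label framing in the abstract can be illustrated with a minimal sketch. It is not the authors' EnHANs model: the ICD-9 codes, the score vector, the 0.5 threshold, and all function names below are hypothetical and chosen only for illustration. It shows how a note that carries several codes becomes a multi-hot target vector, how per-code scores are decoded back into a set of codes, and how micro-averaged F1 (a common multi-label metric, assumed here rather than taken from the paper) compares predicted and true code sets.

```python
from typing import Dict, List, Sequence, Set


def build_label_index(label_sets: Sequence[Set[str]]) -> Dict[str, int]:
    """Map every code seen in the data to a column index (sorted for determinism)."""
    codes = sorted({code for labels in label_sets for code in labels})
    return {code: i for i, code in enumerate(codes)}


def to_multi_hot(labels: Set[str], index: Dict[str, int]) -> List[int]:
    """Encode one note's code set as a 0/1 vector with one slot per known code."""
    vec = [0] * len(index)
    for code in labels:
        vec[index[code]] = 1
    return vec


def decode(scores: Sequence[float], index: Dict[str, int],
           threshold: float = 0.5) -> Set[str]:
    """Keep every code whose score reaches the threshold (several may qualify)."""
    return {code for code, i in index.items() if scores[i] >= threshold}


def micro_f1(true_sets: Sequence[Set[str]], pred_sets: Sequence[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives over all notes."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical code sets for three notes (illustrative values only).
notes_labels = [{"428.0", "401.9"}, {"250.00"}, {"401.9", "250.00", "584.9"}]
index = build_label_index(notes_labels)

target = to_multi_hot(notes_labels[0], index)         # [0, 1, 1, 0]
predicted = decode([0.1, 0.9, 0.7, 0.2], index)       # {"401.9", "428.0"}
score = micro_f1(notes_labels[:2], [{"401.9"}, {"250.00", "584.9"}])
```

A single-label classifier would have to pick exactly one code per note, so the first note above could never be fully correct; the multi-hot target lets each code be predicted independently.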


Index Terms

ICD-9 Tagging of Clinical Notes Using Topical Word Embedding
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction

Information & Contributors

Information

Published In

ICIEB '18: Proceedings of the 2018 1st International Conference on Internet and e-Business

April 2018

389 pages

ISBN:9781450363754

DOI:10.1145/3230348

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

Wuhan Univ.: Wuhan University, China
City University of Hong Kong: City University of Hong Kong

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICIEB '18

ICIEB '18: 2018 International Conference on Internet and e-Business

April 25 - 27, 2018

Singapore, Singapore

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 8
Total Downloads: 231
Downloads (Last 12 months): 18
Downloads (Last 6 weeks): 2

Reflects downloads up to 15 Jan 2025

Citations

Cited By

Uma K, Francis S, Sun W, Moens M (2024). Towards Explainability in Automated Medical Code Prediction from Clinical Records. Intelligent Systems and Applications (593-637). Online publication date: 14-Feb-2024.
https://doi.org/10.1007/978-3-031-47718-8_40
Masud J, Kuo C, Yeh C, Yang H, Lin M (2023). Applying Deep Learning Model to Predict Diagnosis Code of Medical Records. Diagnostics 13:13 (2297). Online publication date: 6-Jul-2023.
https://doi.org/10.3390/diagnostics13132297
Sarwar T, Seifollahi S, Chan J, Zhang X, Aksakalli V, Hudson I, Verspoor K, Cavedon L (2023). The Secondary Use of Electronic Health Records for Data Mining: Data Characteristics and Challenges. ACM Computing Surveys 55:2 (1-40). Online publication date: 31-Mar-2023.
https://doi.org/10.1145/3490234
Kaur R, Ginige J, Obst O (2023). AI-based ICD coding and classification approaches using discharge summaries. Expert Systems with Applications: An International Journal 213:PB. Online publication date: 1-Mar-2023.
https://dl.acm.org/doi/10.1016/j.eswa.2022.118997
Owens E, Sheehan B, Mullins M, Cunneen M, Ressel J, Castignani G (2022). Explainable Artificial Intelligence (XAI) in Insurance. Risks 10:12 (230). Online publication date: 1-Dec-2022.
https://doi.org/10.3390/risks10120230
Masud J, Shun C, Kuo C, Islam M, Yeh C, Yang H, Lin M (2022). Deep-ADCA: Development and Validation of Deep Learning Model for Automated Diagnosis Code Assignment Using Clinical Notes in Electronic Medical Records. Journal of Personalized Medicine 12:5 (707). Online publication date: 28-Apr-2022.
https://doi.org/10.3390/jpm12050707
Singaravelan A, Hsieh C, Liao Y, Hsu J (2021). Predicting ICD-9 Codes Using Self-Report of Patients. Applied Sciences 11:21 (10046). Online publication date: 27-Oct-2021.
https://doi.org/10.3390/app112110046
Hsu J, Hsu T, Hsieh C, Singaravelan A (2020). Applying Convolutional Neural Networks to Predict the ICD-9 Codes of Medical Records. Sensors 20:24 (7116). Online publication date: 11-Dec-2020.
https://doi.org/10.3390/s20247116
