DOI: 10.1145/3230348.3230357
research-article

ICD-9 Tagging of Clinical Notes Using Topical Word Embedding

Published: 25 April 2018

Abstract

The volume of medical records containing text is increasing dramatically every day, which creates a greater need to analyze health information effectively. This can be done through document classification in natural language applications. In this study, we describe the tagging of patient notes with ICD-9 codes using topical word embedding in a deep learning model called EnHANs. We formulate the task as a multi-label, multi-class classification problem to categorize the ICD-9 codes of a dataset with 400,000 critical care unit medical records. Accurate diagnosis information in the form of ICD-9 codes is vital for billing and insurance claims. We demonstrate that, through the use of a topical word embedding model, we learn to classify patient notes with their corresponding ICD-9 labels moderately better than with single-label classification.
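
To make the multi-label formulation concrete, the following is a minimal sketch of how a set of ICD-9 codes per note could be turned into a multi-hot target vector. It is not taken from the paper: the helper names (`build_label_index`, `multi_hot`) and the example notes and codes are assumptions chosen only for illustration, and the abstract does not say how EnHANs encodes its labels.

```python
from typing import Dict, List, Sequence, Set


def build_label_index(label_sets: Sequence[Set[str]]) -> Dict[str, int]:
    """Map each distinct code to a column index, in sorted order."""
    codes = sorted({code for labels in label_sets for code in labels})
    return {code: i for i, code in enumerate(codes)}


def multi_hot(labels: Set[str], index: Dict[str, int]) -> List[int]:
    """Encode a set of codes as a 0/1 vector with one slot per known code."""
    vector = [0] * len(index)
    for code in labels:
        vector[index[code]] = 1
    return vector


# Hypothetical notes, each tagged with one or more ICD-9 codes.
# These label sets are illustrative and do not come from the paper's dataset.
note_codes = [
    {"428.0", "401.9"},
    {"250.00"},
    {"401.9", "250.00", "584.9"},
]

label_index = build_label_index(note_codes)
targets = [multi_hot(codes, label_index) for codes in note_codes]

for codes, target in zip(note_codes, targets):
    print(sorted(codes), target)
```

In a single-label setting each note would carry exactly one class index. Here each note carries a vector with as many active positions as it has codes, so a classifier would typically make one independent yes/no decision per code.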


    Published In

    ICIEB '18: Proceedings of the 2018 1st International Conference on Internet and e-Business
    April 2018
    389 pages
    ISBN:9781450363754
    DOI:10.1145/3230348
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    In-Cooperation

    • Wuhan Univ.: Wuhan University, China
    • City University of Hong Kong: City University of Hong Kong

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Hierarchical Attention Networks
    2. ICD-9 codes
    3. Recurrent Neural Network
    4. Topical Word Embedding

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICIEB '18
