DOI: 10.1145/3627673.3679962

MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models

Published: 21 October 2024

Abstract

Electronic health records (EHRs) are multimodal by nature, consisting of structured tabular features such as lab tests and unstructured clinical notes. In real-life clinical practice, doctors draw on these complementary EHR data sources to form a clearer picture of a patient's health and to support clinical decision-making. Most EHR predictive models, however, do not reflect this process: they either focus on a single modality or overlook inter-modality interactions and redundancy. In this work, we propose MEDFuse, a Multimodal EHR Data Fusion framework that incorporates masked lab-test modeling and large language models (LLMs) to effectively integrate structured and unstructured medical data. MEDFuse leverages multimodal embeddings extracted from two sources: LLMs fine-tuned on free clinical text and masked tabular transformers trained on structured lab-test results. We design a disentangled transformer module, optimized by a mutual information loss, to (1) decouple modality-specific and modality-shared information and (2) extract useful joint representations from the noise and redundancy present in clinical notes. Through comprehensive validation on the public MIMIC-III dataset and the in-house FEMH dataset, MEDFuse demonstrates strong potential for advancing clinical prediction, achieving an F1 score above 90% on a 10-disease multi-label classification task.
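The masked lab-test objective described above follows the same recipe as masked language modeling and masked autoencoders: hide a random subset of lab values, then train a model to reconstruct them, with the loss computed only on hidden positions. The paper's code is not shown here; the sketch below is a minimal illustrative version, and the function names, mask ratio, and fill value are assumptions rather than the authors' implementation.

```python
import numpy as np

def mask_lab_tests(X, mask_ratio=0.15, mask_value=0.0, rng=None):
    """Randomly hide a fraction of lab-test values (BERT/MAE-style corruption).

    Returns the corrupted matrix and a boolean mask marking hidden entries;
    a model would then be trained to reconstruct the originals at those
    positions. Ratio and fill value are illustrative choices.
    """
    rng = rng or np.random.default_rng(0)
    mask = rng.random(X.shape) < mask_ratio      # True = entry is hidden
    X_masked = X.copy()
    X_masked[mask] = mask_value
    return X_masked, mask

def masked_reconstruction_loss(X_true, X_pred, mask):
    """MSE computed only on the masked entries, as in masked modeling."""
    if mask.sum() == 0:
        return 0.0
    return float(((X_true - X_pred) ** 2)[mask].mean())

# Toy example: 4 patients x 6 lab tests.
X = np.arange(24, dtype=float).reshape(4, 6)
X_masked, mask = mask_lab_tests(X, mask_ratio=0.3)

# A trivial "model" that predicts the fill value everywhere; a real
# tabular transformer would replace this with learned reconstructions.
loss = masked_reconstruction_loss(X, X_masked, mask)
```

Unmasked entries are left untouched, so the loss pressures the model to infer hidden lab values from the visible ones, which is what lets the tabular encoder learn cross-test dependencies.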



Published In

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
October 2024
5705 pages
ISBN:9798400704369
DOI:10.1145/3627673

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. computer-aided diagnosis
  2. electronic health records
  3. large language model fine-tuning

Qualifiers

  • Short-paper

Conference

CIKM '24

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Total citations: 0
  • Total downloads: 115
  • Downloads (last 12 months): 115
  • Downloads (last 6 weeks): 59

Reflects downloads up to 23 Dec 2024
