DOI: 10.1145/2783258.2783352
research-article

Temporal Phenotyping from Longitudinal Electronic Health Records: A Graph Based Framework

Published: 10 August 2015

Abstract

The rapid growth of healthcare information systems has increased interest in using patient Electronic Health Records (EHRs) to assist disease diagnosis and phenotyping. Patient EHRs are generally longitudinal and are naturally represented as medical event sequences, where the events include clinical notes, problems, medications, vital signs, laboratory reports, etc. These longitudinal and heterogeneous properties make EHR analysis inherently difficult. To address this challenge, we develop a novel representation of such event sequences, the temporal graph. Because it captures the temporal relationships among the medical events in each sequence, the temporal graph is informative for a variety of challenging analytic tasks, such as predictive modeling. By summarizing the longitudinal data, temporal graphs are also robust to noisy and irregular observations. Based on the temporal graph representation, we further develop an approach to temporal phenotyping that identifies the most significant and interpretable graph bases as phenotypes, which helps us better understand how diseases evolve. Moreover, when the temporal graphs are expressed in terms of the phenotypes, the resulting coefficients can be used for applications such as personalized medicine, disease diagnosis, and patient segmentation. Our temporal phenotyping framework is also flexible enough to incorporate semi-supervised or supervised information. Finally, we validate our framework on two real-world tasks: predicting the onset risk of heart failure, and predicting the risk of heart-failure-related hospitalization for patients with pre-existing COPD. Our results show that the proposed approaches significantly improve diagnosis performance on both tasks. We also illustrate some interesting phenotypes derived from the data.
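
To make the idea of a temporal graph concrete, the following is a minimal sketch of how one patient's event sequence might be summarized as a weighted directed graph. The abstract does not give the edge-weight definition, so everything here is an assumption for illustration: the function name `temporal_graph`, the exponential decay with parameter `scale`, the cutoff `window`, and the sample event names are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def temporal_graph(events, scale=30.0, window=90.0):
    """Summarize one event sequence as a weighted directed graph.

    events: list of (event_type, time_in_days) pairs.
    Returns a dict mapping (earlier_event, later_event) to a weight.

    Assumption (not from the paper): each ordered pair of events that
    occurs within `window` days contributes exp(-gap / scale), so pairs
    that are close in time contribute more than pairs that are far apart.
    """
    ordered = sorted(events, key=lambda e: e[1])
    weights = defaultdict(float)
    for i, (src, t_src) in enumerate(ordered):
        for dst, t_dst in ordered[i + 1:]:
            gap = t_dst - t_src
            if gap > window:
                break  # later events are even further away in time
            weights[(src, dst)] += math.exp(-gap / scale)
    return dict(weights)


# Hypothetical sequence: two events close together, one much later.
sequence = [("hypertension", 0), ("diuretic", 10), ("heart_failure", 200)]
graph = temporal_graph(sequence)
```

Under these assumptions the graph keeps only the edge from `hypertension` to `diuretic`, because the third event falls outside the window. Every sequence is mapped onto the same set of event types, whatever its length, so the resulting graphs are directly comparable across patients.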

Cited By

  • (2024) Introducing Attribute Association Graphs to Facilitate Medical Data Exploration: Development and Evaluation Using Epidemiological Study Data. JMIR Medical Informatics, 12 (e49865). DOI: 10.2196/49865. Online publication date: 24-Jul-2024
  • (2024) TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (6324-6334). DOI: 10.1145/3637528.3671594. Online publication date: 25-Aug-2024
  • (2024) Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation. Artificial Intelligence in Medicine, 149:C. DOI: 10.1016/j.artmed.2024.102802. Online publication date: 1-Mar-2024

    Published In

    KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    August 2015
    2378 pages
    ISBN:9781450336642
    DOI:10.1145/2783258

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. electronic health records
    2. regularization
    3. temporal graph
    4. temporal phenotyping

    Qualifiers

    • Research-article

    Conference

    KDD '15

    Acceptance Rates

    KDD '15 Paper Acceptance Rate 160 of 819 submissions, 20%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
