DOI: 10.1145/3637528.3671652
research-article
Open access

Personalised Drug Identifier for Cancer Treatment with Transformers using Auxiliary Information

Published: 24 August 2024

Abstract

Cancer remains a global challenge due to its growing clinical and economic burden. Its uniquely personal manifestation, which makes treatment difficult, has fuelled the quest for personalized treatment strategies. As a result, genomic profiling is increasingly becoming part of clinical diagnostic panels. Effective use of such panels requires accurate drug response prediction (DRP) models, which are challenging to build because labelled patient data is limited. Previous methods have addressed this problem with various forms of transfer learning. However, they do not explicitly model the variable-length sequential structure of the list of mutations in such diagnostic panels. Further, they do not utilize auxiliary information (such as patient survival) for model training. We address these limitations through a novel transformer-based method, which surpasses the performance of state-of-the-art DRP models on benchmark data. Code for our method is available at https://github.com/CDAL-SOC/PREDICT-AI.
We also present the design of a treatment recommendation system (TRS), which is currently deployed at the National University Hospital, Singapore and is being evaluated in a clinical trial. We discuss why the recommended drugs and their predicted scores alone, obtained from DRP models, are insufficient for treatment planning. Treatment planning for complex cancer cases, in the face of limited clinical validation, requires assessment of many other factors, including several indirect sources of evidence on drug efficacy. We discuss key lessons learnt on model validation and use of indirect supporting evidence to build clinicians' trust and aid their decision making.

Supplemental Material

MP4 File - ads0918-video.mp4
Promotional video for personalised cancer treatment recommendation using transformers and auxiliary patient information.

    Published In

    KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2024
    6901 pages
    ISBN:9798400704901
    DOI:10.1145/3637528
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. auxiliary information
    2. cancer drug response prediction
    3. clinical deployment
    4. personalized treatment recommendation
    5. survival prediction
    6. transformers

    Conference

    KDD '24

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
