Research Article

SG-Fusion: A Swin-Transformer and graph convolution-based multi-modal deep neural network for glioma prognosis

Published: 01 November 2024

Abstract

Integrating morphological attributes extracted from histopathological images with genomic data holds significant promise for advancing tumor diagnosis, prognosis, and grading. Histopathological images, acquired through microscopic examination of tissue slices, provide valuable insights into cellular structures and pathological features, while genomic data characterize tumor gene expression and function. Fusing these two distinct data types is crucial for a more comprehensive understanding of tumor characteristics and progression. Many earlier studies relied on single-modal approaches to tumor diagnosis, which cannot fully exploit the information available across data sources. To address this limitation, researchers have turned to multi-modal methods that leverage both histopathological images and genomic data concurrently; these methods better capture the multifaceted nature of tumors and improve diagnostic accuracy. Nonetheless, existing multi-modal methods tend to oversimplify both the modality-specific feature extraction and the fusion process. In this study, we present a dual-branch neural network, SG-Fusion. For the histopathological modality, we use the Swin-Transformer architecture to capture both local and global features and incorporate contrastive learning to encourage the model to discern commonalities and differences in the representation space. For the genomic modality, we develop a graph convolutional network based on gene functional and expression-level similarities. In addition, the model integrates a cross-attention module to enhance information interaction between the modalities and employs divergence-based regularization to improve generalization. Validation on glioma datasets from The Cancer Genome Atlas (TCGA) demonstrates that SG-Fusion outperforms both single-modal methods and existing multi-modal approaches in survival analysis and tumor grading.
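To make the fusion step concrete, below is a minimal PyTorch sketch of a cross-attention module of the kind the abstract describes, together with one plausible form of a divergence-based regularizer (symmetric KL between the two branches' predictions). All names, dimensions, and design choices here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossAttentionFusion(nn.Module):
    """Toy cross-attention fusion of image and gene features (illustrative only)."""

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Each modality queries the other: image tokens attend to gene
        # tokens, and gene tokens attend to image tokens.
        self.img_to_gene = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gene_to_img = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, img_feat: torch.Tensor, gene_feat: torch.Tensor) -> torch.Tensor:
        # img_feat:  (B, N_img, dim)  patch embeddings from the image branch
        # gene_feat: (B, N_gene, dim) node embeddings from the genomic branch
        img_attn, _ = self.img_to_gene(img_feat, gene_feat, gene_feat)
        gene_attn, _ = self.gene_to_img(gene_feat, img_feat, img_feat)
        # Mean-pool each attended sequence, concatenate, and project to a
        # joint embedding for the survival / grading heads.
        fused = torch.cat([img_attn.mean(dim=1), gene_attn.mean(dim=1)], dim=-1)
        return self.fuse(fused)


def divergence_regularizer(img_logits: torch.Tensor, gene_logits: torch.Tensor) -> torch.Tensor:
    """Symmetric KL between the two branches' class distributions.

    One plausible form of divergence-based regularization; the paper's
    exact formulation may differ.
    """
    log_p, log_q = img_logits.log_softmax(-1), gene_logits.log_softmax(-1)
    return 0.5 * (F.kl_div(log_p, log_q.exp(), reduction="batchmean")
                  + F.kl_div(log_q, log_p.exp(), reduction="batchmean"))


# Shapes only; random tensors stand in for the two branches' outputs.
fused = CrossAttentionFusion()(torch.randn(2, 64, 256), torch.randn(2, 100, 256))
print(fused.shape)  # torch.Size([2, 256])
```

In this sketch each modality's tokens attend to the other's, so morphological patches can be reweighted by genomic context and vice versa before pooling, while the regularizer nudges the two branches toward consistent predictions, one common way to encourage generalization.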

Highlights

Developed a multi-modal, multi-task framework for glioma diagnosis.
Integrated Swin-Transformer v2 with contrastive learning to enhance image features.
Implemented a novel gene selection method to reduce data redundancy (the genomic graph branch is sketched after this list).
Enhanced feature quality with cross-modal attention and divergence regularization.
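The genomic branch is a graph convolutional network built from gene functional and expression-level similarities. As a rough sketch of how such a graph might be constructed and convolved, the snippet below thresholds absolute Pearson correlation to obtain an adjacency matrix and applies the symmetrically normalized propagation rule of Kipf and Welling. The threshold value, the omission of functional similarity, and all function names are assumptions, not the paper's method.

```python
import numpy as np
import torch
import torch.nn as nn


def expression_similarity_adjacency(expr: np.ndarray, threshold: float = 0.7) -> torch.Tensor:
    """Normalized gene-gene adjacency from expression similarity.

    expr: (n_samples, n_genes) expression matrix. Genes whose absolute
    Pearson correlation exceeds `threshold` are connected; the paper also
    uses functional similarity, which this sketch omits.
    """
    corr = np.corrcoef(expr.T)                        # (n_genes, n_genes)
    adj = (np.abs(corr) > threshold).astype(np.float32)
    np.fill_diagonal(adj, 1.0)                        # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(adj.sum(axis=1)))
    return torch.from_numpy(d_inv_sqrt @ adj @ d_inv_sqrt)  # D^-1/2 A D^-1/2


class GCNLayer(nn.Module):
    """One graph-convolution layer, H' = ReLU(A_hat H W), after Kipf & Welling."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        # h: (n_genes, in_dim) node features; a_hat: normalized adjacency.
        return torch.relu(a_hat @ self.linear(h))
```

Stacking two or three such layers over the selected-gene graph would yield the gene embeddings that enter the cross-attention fusion sketched above.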



Published In

Artificial Intelligence in Medicine, Volume 157, Issue C, November 2024, 404 pages

Publisher

Elsevier Science Publishers Ltd., United Kingdom

Author Tags

1. Histopathological images
2. Genomic data
3. Tumor diagnosis
4. Multimodal learning
5. Contrastive learning
