
Generating lymphoma ultrasound image description with transformer model

Published: 01 May 2024

Abstract

Lymphoma, the most prevalent hematologic tumor, originates from the lymphatic hematopoietic system and can be accurately diagnosed using high-resolution ultrasound. Its ultrasonographic appearance enables clinicians to identify suspected tumors and subsequently obtain a definitive pathological diagnosis through puncture biopsy. However, the complex and diverse ultrasonographic manifestations of lymphoma make accurate characterization difficult for sonographers. To address this, this study proposes a Transformer-based model that generates textual descriptions of lymphoma ultrasound images, providing auxiliary guidance for ultrasound physicians during screening. Specifically, deep stable learning is integrated into the model to remove dependencies between features by learning sample weights. In addition, a memory module is incorporated into the decoder to strengthen the modeling of semantic information in descriptions and to exploit learned semantic tree-branch structures for more detailed image depiction. Experimental results on an ultrasound diagnosis dataset from Shanghai Ruijin Hospital show that the proposed model outperforms related methods in prediction performance.
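The deep stable learning component described above learns per-sample weights that remove dependencies between features. A minimal sketch of that idea, assuming a plain weighted-covariance penalty as a simplified stand-in for the kernel-based independence measures used in the deep stable learning literature (`decorrelation_loss` and `learn_sample_weights` are hypothetical names, not the paper's API):

```python
import numpy as np

def decorrelation_loss(X, w):
    """Sum of squared off-diagonal entries of the weighted feature
    covariance of X (n_samples x n_features) under sample weights w."""
    w = w / w.sum()                       # normalize weights
    mu = X.T @ w                          # weighted feature means
    Xc = X - mu                           # center features
    cov = (Xc * w[:, None]).T @ Xc        # weighted covariance matrix
    off = cov - np.diag(np.diag(cov))     # off-diagonal (cross-feature) terms
    return float(np.sum(off ** 2))

def learn_sample_weights(X, steps=300, lr=0.1, eps=1e-5):
    """Gradient-descend the penalty over log-weights so the learned
    weights stay positive (finite-difference gradient for brevity)."""
    n = X.shape[0]
    z = np.zeros(n)                       # w = exp(z), uniform at start
    for _ in range(steps):
        base = decorrelation_loss(X, np.exp(z))
        grad = np.empty(n)
        for i in range(n):
            zp = z.copy()
            zp[i] += eps
            grad[i] = (decorrelation_loss(X, np.exp(zp)) - base) / eps
        z -= lr * grad
    w = np.exp(z)
    return w / w.sum()
```

Training samples would then contribute to the captioning loss in proportion to these weights, so that spurious feature co-occurrences in the ultrasound data carry less influence on the learned description model.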

Highlights

Generates descriptions for input lymphoma ultrasound images using a sequence-to-sequence structure.
A memory mechanism implicitly models and retains semantic information during description generation.
Deep stable learning removes dependencies between features by learning sample weights.
Interpretability analysis via attention maps of the cross-attention layers in the visual decoder.
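The memory mechanism in the highlights can be pictured as a small matrix of slots blended with each decoding step's hidden state through a learned gate, in the spirit of memory-driven transformer decoders. A minimal sketch under that assumption, with a per-slot scalar gate and hypothetical projection matrices `Wg` and `Wc` (the paper's actual memory update is more elaborate):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def memory_update(M, h, Wg, Wc):
    """Gated update of memory M (slots x dim) with decoder state h (dim,).
    Each slot keeps a gated fraction of its old content and absorbs the
    rest from a candidate vector derived from the current hidden state."""
    gate = sigmoid(M @ (Wg @ h))                   # one gate per slot, in (0, 1)
    cand = np.tanh(Wc @ h)                         # shared candidate content (dim,)
    return gate[:, None] * M + (1.0 - gate)[:, None] * cand
```

During generation, the decoder would read from the updated memory with attention, letting recurring report patterns (e.g. typical descriptions of nodal echotexture) be reused across sentences.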



Published In

Computers in Biology and Medicine, Volume 174, Issue C, May 2024, 849 pages

Publisher

Pergamon Press, Inc.

United States


Author Tags

  1. Lymphoma
  2. Image description
  3. Transformer
  4. Deep stable learning

Qualifiers

  • Research-article
