Image and Text Feature Based Multimodal Learning for Multi-Label Classification of Radiology Images in Biomedical Literature

Md. Hasan; Md Jani; Md Rahman

Topics: Medical Informatics; Pattern Recognition and Machine Learning

In Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: HEALTHINF, pages 679-686, 2024, Rome, Italy

Authors: Md. Hasan; Md Jani; Md Rahman

Affiliation: Computer Science Department, Morgan State University, Baltimore, Maryland, U.S.A.

Keyword(s): Biomedical Image Annotation, Image Retrieval, Multimodal Learning, ResNet50, ViT, CNN, DistilGPT2.

Abstract: Biomedical images are crucial for diagnosis and treatment planning, as well as for advancing scientific understanding of various ailments. Annotation markers such as arrows, letters, or symbols are used to highlight regions of interest (RoIs) and convey medical concepts. However, annotating these images with appropriate medical labels remains a significant challenge. In this study, we propose a framework that leverages multimodal input features, combining text/label features with visual features, to annotate biomedical images accurately with multiple labels. Our approach uses ResNet50 and Vision Transformers (ViT) to extract informative features from the images, and DistilGPT2, a distilled Transformer-based generative pre-trained language model, to extract textual features. Combining the image and text modalities gives a more comprehensive representation of the biomedical data, leading to improved annotation accuracy. On the combined features, we trained a simplified Convolutional Neural Network (CNN) based multi-label classifier to learn image-text relations and predict multiple labels for multimodal radiology images. We evaluated the framework on the ImageCLEFmedical 2022 and 2023 datasets, which contain a diverse range of biomedical images and allow evaluation under realistic conditions. We achieved promising results, with an F1 score of 0.508. The proposed framework shows potential for annotating biomedical images with multiple labels, contributing to improved image understanding and analysis in the medical image processing domain.
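The pipeline described in the abstract (fuse image and text features, predict several labels per image, score with F1) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes fusion by simple concatenation, replaces the CNN classifier with one independent sigmoid unit per label, uses a 0.5 decision threshold, and averages F1 per image; the abstract specifies none of these details. All function names, label names, weights, and feature values are hypothetical.

```python
import math


def fuse(image_feats, text_feats):
    """Assumed fusion: concatenate the image and text feature vectors."""
    return list(image_feats) + list(text_feats)


def predict_labels(fused, weights, biases, threshold=0.5):
    """Stand-in for the CNN classifier: one sigmoid unit per label.

    Every label is scored independently, so an image can receive
    zero, one, or several labels (multi-label classification).
    """
    predicted = set()
    for label, w in weights.items():
        z = sum(wi * xi for wi, xi in zip(w, fused)) + biases[label]
        if 1.0 / (1.0 + math.exp(-z)) >= threshold:
            predicted.add(label)
    return predicted


def sample_f1(true_labels, pred_labels):
    """F1 between the true and predicted label sets of one image."""
    if not true_labels and not pred_labels:
        return 1.0
    tp = len(true_labels & pred_labels)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_labels)
    recall = tp / len(true_labels)
    return 2 * precision * recall / (precision + recall)


def mean_f1(truths, preds):
    """Average the per-image F1 over a collection of images."""
    return sum(sample_f1(t, p) for t, p in zip(truths, preds)) / len(truths)


# Hypothetical toy input: a 2-d image feature and a 1-d text feature.
fused = fuse([0.2, 0.8], [1.0])
weights = {
    "label_a": [1.0, 1.0, 1.0],
    "label_b": [-1.0, -1.0, -1.0],
    "label_c": [2.0, 0.0, -0.1],
}
biases = {"label_a": 0.0, "label_b": 0.0, "label_c": 0.0}
predicted = predict_labels(fused, weights, biases)
print(sorted(predicted), sample_f1({"label_a", "label_b"}, predicted))
```

In practice the feature vectors would come from pretrained encoders such as ResNet50, ViT, and DistilGPT2, and the per-label weights would be learned rather than fixed by hand.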

CC BY-NC-ND 4.0

Paper citation in several formats:

Hasan, M., Jani, M. and Rahman, M. (2024). Image and Text Feature Based Multimodal Learning for Multi-Label Classification of Radiology Images in Biomedical Literature. In Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - HEALTHINF; ISBN 978-989-758-688-0; ISSN 2184-4305, SciTePress, pages 679-686. DOI: 10.5220/0012438400003657

@conference{healthinf24,
author={Md. Hasan and Md Jani and Md Rahman},
title={Image and Text Feature Based Multimodal Learning for Multi-Label Classification of Radiology Images in Biomedical Literature},
booktitle={Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - HEALTHINF},
year={2024},
pages={679-686},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012438400003657},
isbn={978-989-758-688-0},
issn={2184-4305},
}

TY - CONF
JO - Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - HEALTHINF
TI - Image and Text Feature Based Multimodal Learning for Multi-Label Classification of Radiology Images in Biomedical Literature
SN - 978-989-758-688-0
IS - 2184-4305
AU - Hasan, M.
AU - Jani, M.
AU - Rahman, M.
PY - 2024
SP - 679
EP - 686
DO - 10.5220/0012438400003657
PB - SciTePress