Chen et al., 2024 - Google Patents

Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model

Document ID: 12255755920595553974
Author: Chen K; Thapa R; Chalamala R; Athiwaratkun B; Song S; Zou J
Publication year: 2024
Publication venue: arXiv preprint arXiv:2406.00977

Snippet

Recent advances in large multimodal models (LMMs) suggest that higher image resolution enhances the fine-grained understanding of image details, crucial for tasks such as visual commonsense reasoning and analyzing biomedical images. However, increasing input …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/30—Medical informatics, i.e. computer-based analysis or dissemination of patient or disease data
- G06F19/32—Medical data management, e.g. systems or protocols for archival or communication of medical images, computerised patient records or computerised general medical references
- G06F19/321—Management of medical image data, e.g. communication or archiving systems such as picture archiving and communication systems [PACS] or related medical protocols such as digital imaging and communications in medicine protocol [DICOM]; Editing of medical image data, e.g. adding diagnosis information
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/30—Medical informatics, i.e. computer-based analysis or dissemination of patient or disease data
- G06F19/34—Computer-assisted medical diagnosis or treatment, e.g. computerised prescription or delivery of medication or diets, computerised local control of medical devices, medical expert systems or telemedicine
- G06F19/345—Medical expert systems, neural networks or other automated diagnosis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/22—Health care, e.g. hospitals; Social work
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image, e.g. from bit-mapped to bit-mapped creating a different image
- G06T3/0031—Geometric image transformation in the plane of the image, e.g. from bit-mapped to bit-mapped creating a different image for topological mapping of a higher dimensional structure on a lower dimensional surface
- G06T3/0037—Reshaping or unfolding a 3D tree structure onto a 2D plane
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K2209/00—Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints

Similar Documents

Publication	Publication Date	Title
Zhang et al.	2023	Biomedgpt: A unified and generalist biomedical generative pre-trained transformer for vision, language, and multimodal tasks
CN105760874A (en)	2016-07-13	CT image processing system and method for pneumoconiosis
Lin et al.	2023	BATFormer: Towards boundary-aware lightweight transformer for efficient medical image segmentation
CN113506310A (en)	2021-10-15	Medical image processing method and device, electronic equipment and storage medium
Amara et al.	2022	COVIR: A virtual rendering of a novel NN architecture O-Net for COVID-19 CT-scan automatic lung lesions segmentation
Liu et al.	2023	A systematic review of deep learning-based research on radiology report generation
Varçin et al.	2019	Diagnosis of lumbar spondylolisthesis via convolutional neural networks
Karam et al.	2021	A progressive and cross-domain deep transfer learning framework for wrist fracture detection
Le Van et al.	2021	Detecting lumbar implant and diagnosing scoliosis from Vietnamese X-ray imaging using the pre-trained API models and transfer learning
Lin et al.	2023	Towards medical artificial general intelligence via knowledge-enhanced multimodal pretraining
Wu et al.	2023	K-diag: Knowledge-enhanced disease diagnosis in radiographic imaging
Jha et al.	2024	CT Liver Segmentation via PVT-based Encoding and Refined Decoding
Feng et al.	2022	Utransnet: Transformer within u-net for stroke lesion segmentation
Pelka et al.	2019	Branding-fusion of meta data and musculoskeletal radiographs for multi-modal diagnostic recognition
Dai et al.	2024	UniChest: Conquer-and-Divide Pre-training for Multi-Source Chest X-Ray Classification
Nguyen et al.	2023	Collaborative consultation doctors model: Unifying CNN and ViT for COVID-19 diagnostic
Jia et al.	2021	AMO-Net: abdominal multi-organ segmentation in MRI with a extend Unet
Guo	2024	Applying Medical Language Models to Medical Image Analysis
Machado	2023	Mandible-focused osteoporosis risk assessment using dental panoramic radiography and artificial intelligence models
Dalla Serra et al.	2022	Improving Image Representations via MoCo Pre-training for Multimodal CXR Classification
Hamamci et al.	2024	Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography
Crespi	2023	Towards a comprehensive application of deep learning methods in medical imaging: semantic segmentation, features extraction, synthetic image generation
Ma et al.	2023	Symmetrical awareness network for cross-site ultrasound thyroid nodule segmentation
Si et al.	2023	Non-symmetrical Sibling-Stream Network with Adaptive Positional Encoding for Automatic Medical Report Generation