Chen et al., 2024 - Google Patents
Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model
- Document ID
- 12255755920595553974
- Author
- Chen K
- Thapa R
- Chalamala R
- Athiwaratkun B
- Song S
- Zou J
- Publication year
- 2024
- Publication venue
- arXiv preprint arXiv:2406.00977
Snippet
Recent advances in large multimodal models (LMMs) suggest that higher image resolution enhances the fine-grained understanding of image details, crucial for tasks such as visual commonsense reasoning and analyzing biomedical images. However, increasing input …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/30—Medical informatics, i.e. computer-based analysis or dissemination of patient or disease data
- G06F19/32—Medical data management, e.g. systems or protocols for archival or communication of medical images, computerised patient records or computerised general medical references
- G06F19/321—Management of medical image data, e.g. communication or archiving systems such as picture archiving and communication systems [PACS] or related medical protocols such as digital imaging and communications in medicine protocol [DICOM]; Editing of medical image data, e.g. adding diagnosis information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/30—Medical informatics, i.e. computer-based analysis or dissemination of patient or disease data
- G06F19/34—Computer-assisted medical diagnosis or treatment, e.g. computerised prescription or delivery of medication or diets, computerised local control of medical devices, medical expert systems or telemedicine
- G06F19/345—Medical expert systems, neural networks or other automated diagnosis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/22—Health care, e.g. hospitals; Social work
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image, e.g. from bit-mapped to bit-mapped creating a different image
- G06T3/0031—Geometric image transformation in the plane of the image, e.g. from bit-mapped to bit-mapped creating a different image for topological mapping of a higher dimensional structure on a lower dimensional surface
- G06T3/0037—Reshaping or unfolding a 3D tree structure onto a 2D plane
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K2209/00—Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Zhang et al. | Biomedgpt: A unified and generalist biomedical generative pre-trained transformer for vision, language, and multimodal tasks | |
Chen et al. | Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model | |
CN105760874A (en) | CT image processing system and method for pneumoconiosis | |
Lin et al. | BATFormer: Towards boundary-aware lightweight transformer for efficient medical image segmentation | |
CN113506310A (en) | Medical image processing method and device, electronic equipment and storage medium | |
Amara et al. | COVIR: A virtual rendering of a novel NN architecture O-Net for COVID-19 Ct-scan automatic lung lesions segmentation | |
Liu et al. | A systematic review of deep learning-based research on radiology report generation | |
Varçin et al. | Diagnosis of lumbar spondylolisthesis via convolutional neural networks | |
Karam et al. | A progressive and cross-domain deep transfer learning framework for wrist fracture detection | |
Le Van et al. | Detecting lumbar implant and diagnosing scoliosis from vietnamese X-ray imaging using the pre-trained api models and transfer learning | |
Lin et al. | Towards medical artificial general intelligence via knowledge-enhanced multimodal pretraining | |
Wu et al. | K-diag: Knowledge-enhanced disease diagnosis in radiographic imaging | |
Jha et al. | CT Liver Segmentation via PVT-based Encoding and Refined Decoding | |
Feng et al. | Utransnet: Transformer within u-net for stroke lesion segmentation | |
Pelka et al. | Branding-fusion of meta data and musculoskeletal radiographs for multi-modal diagnostic recognition | |
Dai et al. | UniChest: Conquer-and-Divide Pre-training for Multi-Source Chest X-Ray Classification | |
Nguyen et al. | Collaborative consultation doctors model: Unifying CNN and ViT for COVID-19 diagnostic | |
Jia et al. | Amo-net: abdominal multi-organ segmentation in mri with a extend unet | |
Guo | Applying Medical Language Models to Medical Image Analysis | |
Machado | Mandible-focused osteoporosis risk assessment using dental panoramic radiography and artificial intelligence models | |
Dalla Serra et al. | Improving Image Representations via MoCo Pre-training for Multimodal CXR Classification | |
Hamamci et al. | Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography | |
Crespi | Towards a comprehensive application of deep learning methods in medical imaging: semantic segmentation, features extraction, synthetic image generation | |
Ma et al. | Symmetrical awareness network for cross-site ultrasound thyroid nodule segmentation | |
Si et al. | Non-symmetrical Sibling-Stream Network with Adaptive Positional Encoding for Automatic Medical Report Generation |