Jan 24, 2024 · Multi-modal information retrieval (MMIR) is a rapidly evolving field, where significant progress, particularly in image-text pairing, has been made through advanced representation learning. To fill the gap in evaluating models' MMIR ability in the scientific domain, we introduce SciMMIR, a Scientific Multi-Modal Information Retrieval benchmark, together with a corresponding dataset developed by leveraging open-access paper collections. The benchmark comprises 530K meticulously curated image-text pairs, extracted from figures and tables with detailed captions in scientific documents. We conduct zero-shot and fine-tuned evaluations on prominent multi-modal image-captioning and visual language models, such as CLIP, BLIP, and BLIP-2.
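The retrieval setting above can be sketched in a few lines. This is a minimal, illustrative example of text-to-image retrieval scoring, assuming each figure/table image and each caption has already been embedded into a shared vector space by a model such as CLIP; the embeddings here are random stand-ins, not output from any real model.

```python
import numpy as np

def retrieve(query_emb: np.ndarray, image_embs: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the top-k images by cosine similarity to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    scores = imgs @ q                # cosine similarity of each image to the query
    return np.argsort(-scores)[:k]  # indices sorted by descending similarity

# Toy data: 4 fake "image" embeddings; the query is a slight
# perturbation of embedding 2, so retrieval should rank it first.
rng = np.random.default_rng(0)
images = rng.normal(size=(4, 8))
query = images[2] + 0.01 * rng.normal(size=8)
print(retrieve(query, images, k=1))  # -> [2]
```

Zero-shot evaluation of a real model would replace the random arrays with encoder outputs; fine-tuning would update the encoders so that matching caption-image pairs score higher under the same cosine ranking.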