Search Results
-
Bilingual video captioning model for enhanced video retrieval
Many video platforms rely on the descriptions that uploaders provide for video retrieval. However, this reliance may cause inaccuracies. Although...
-
Self-expressive induced clustered attention for video-text retrieval
Extensive research has proven that self-attention achieves impressive performance in video-text retrieval. However, most state-of-the-art methods...
-
Learning Text-to-Video Retrieval from Image Captioning
We describe a protocol to study text-to-video retrieval training with unlabeled videos, where we assume (i) no access to labels for any videos, i.e.,...
-
Efficient text augmentation in latent space for video retrieval
With the popularity of video sharing applications and streaming platforms, video retrieval became an active research topic. The core technique behind...
-
LSECA: local semantic enhancement and cross aggregation for video-text retrieval
Recently video retrieval based on the pre-training models (e.g., CLIP) has achieved outstanding success. To further improve the search performance,...
-
Opposition-based optimized max pooled 3D convolutional features for action video retrieval
Key frame selection serves as a bridge between raw video data and meaningful retrieval results. Effective key frame selection enhances the...
-
Attention-based deep supervised hashing for near duplicate video retrieval
With the explosive growth of video data on the Internet, near duplicate video retrieval (NDVR) has become an important and challenging issue in the...
-
Hierarchical bi-directional conceptual interaction for text-video retrieval
The large pre-trained vision-language models (VLMs) utilized in text-video retrieval have demonstrated strong cross image-text understanding ability....
-
MGSGA: Multi-grained and Semantic-Guided Alignment for Text-Video Retrieval
In the text-video retrieval task, the objective is to calculate the similarity between a text and a video, and rank the relevant candidates higher....
-
Multimodal video retrieval with CLIP: a user study
Recent machine learning advances demonstrate the effectiveness of zero-shot models trained on large amounts of data collected from the internet....
-
Particle swarm optimized deep spatio-temporal features for efficient video retrieval
In content-based video retrieval, the phases of video frame selection and 3-dimensional feature extraction are especially crucial. These stages...
-
SPSD: Similarity-preserving self-distillation for video–text retrieval
Most existing methods solve cross-modal video and text retrieval via coarse-grained similarity computation based on global representations or...
-
Learning optimal deep prototypes for video retrieval systems with hybrid SVM-softmax layer
The research focuses on optimizing training time for video retrieval by producing optimized prototypes for a hybrid SVM-softmax regression...
-
EA-VTR: Event-Aware Video-Text Retrieval
Understanding the content of events occurring in the video and their inherent temporal logic is crucial for video-text retrieval. However,...
-
An intelligent surgical video retrieval for computer vision enhancement in medical diagnosis using deep learning techniques
This paper addresses the challenge of efficiently retrieving surgical videos from large databases for computer vision enhancement in medical...
-
Video–text retrieval via multi-modal masked transformer and adaptive attribute-aware graph convolutional network
Despite significant advancements in deep learning-based video–text retrieval methods, three challenges persist: the alignment of fine-grained...
-
Deep learning for video-text retrieval: a review
Video-Text Retrieval (VTR) aims to search for the most relevant video related to the semantics in a given sentence, and vice versa. In general, this...
-
A multi-modal lecture video indexing and retrieval framework with multi-scale residual attention network and multi-similarity computation
Due to technological development, the mass production of video and its storage on the Internet has increased. This made a huge amount of videos to be...
-
MQuA: Multi-level Query-Video Augmentation for Multilingual Video Corpus Retrieval
Multilingual Video Corpus Retrieval (mVCR) aims to localize the most relevant videos in a large collection of untrimmed bilingual instructional...
-
Text-video retrieval method based on enhanced self-attention and multi-task learning
The explosive growth of videos on the Internet makes it a great challenge to use texts to retrieve the videos we need. The general method of...