Article

VISIONE at Video Browser Showdown 2022

Authors: Giuseppe Amato, Paolo Bolettieri, Fabrizio Falchi, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo

MultiMedia Modeling: 28th International Conference, MMM 2022, Phu Quoc, Vietnam, June 6–10, 2022, Proceedings, Part II

Pages 543–548

https://doi.org/10.1007/978-3-030-98355-0_52

Published: 06 June 2022

Abstract

VISIONE is a content-based retrieval system that supports several search functionalities: text search, object- and color-based search, semantic and visual similarity search, and temporal search. It uses a full-text search engine as its search backend. In the latest version of the system, we modified the user interface and changed some of the techniques used to analyze and search videos.
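
As a rough illustration of what using a full-text search engine as a backend can mean, the sketch below encodes detected objects as text tokens and retrieves frames through a small inverted index. It is not the VISIONE implementation: the token format (`dog_a1`), the grid-cell labels, the frame identifiers, and the names `encode_frame` and `TinyTextIndex` are all hypothetical, and a real system would delegate indexing and scoring to an actual search engine rather than this in-memory stand-in.

```python
from collections import Counter, defaultdict


def encode_frame(detections):
    """Turn (label, grid_cell) pairs into text tokens such as 'dog_a1'.

    The token format is an assumption made for this sketch only.
    """
    return [f"{label}_{cell}" for label, cell in detections]


class TinyTextIndex:
    """Toy inverted index standing in for a full-text search engine."""

    def __init__(self):
        # token -> Counter mapping frame_id to term frequency
        self.postings = defaultdict(Counter)

    def add(self, frame_id, tokens):
        for tok in tokens:
            self.postings[tok][frame_id] += 1

    def search(self, tokens):
        # Score each frame by summing term frequencies of matching tokens.
        scores = Counter()
        for tok in tokens:
            for frame_id, tf in self.postings.get(tok, {}).items():
                scores[frame_id] += tf
        # Highest score first; ties broken by frame id for determinism.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical frames and detections, invented for the example.
index = TinyTextIndex()
index.add("v1_f10", encode_frame([("dog", "a1"), ("ball", "b2")]))
index.add("v2_f03", encode_frame([("dog", "a1"), ("person", "c3")]))
index.add("v3_f07", encode_frame([("car", "b2")]))

# Query: a dog in cell a1 and a ball in cell b2.
results = index.search(encode_frame([("dog", "a1"), ("ball", "b2")]))
print(results)  # [('v1_f10', 2), ('v2_f03', 1)]
```

The point of the sketch is only that, once visual content is expressed as tokens, object and position queries reduce to ordinary keyword matching and ranking.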



Index Terms

VISIONE at Video Browser Showdown 2022
1. Human-centered computing
  1. Human computer interaction (HCI)
2. Information systems
  1. Information retrieval
    1. Information retrieval query processing

Index terms have been assigned to the content through auto-classification.



Published In


MultiMedia Modeling: 28th International Conference, MMM 2022, Phu Quoc, Vietnam, June 6–10, 2022, Proceedings, Part II

Jun 2022

613 pages

ISBN: 978-3-030-98354-3

DOI: 10.1007/978-3-030-98355-0

Editors:
Björn Þór Jónsson (IT University of Copenhagen, Copenhagen, Denmark)
Cathal Gurrin (Dublin City University, Dublin, Ireland)
Minh-Triet Tran (University of Science, VNU-HCM, Ho Chi Minh City, Vietnam)
Duc-Tien Dang-Nguyen (University of Bergen, Bergen, Norway)
Anita Min-Chun Hu (National Tsing Hua University, Hsinchu, Taiwan)
Binh Huynh Thi Thanh (Hanoi University of Science and Technology, Hanoi, Vietnam)
Benoit Huet (Median Technologies, Valbonne, France)

© Springer Nature Switzerland AG 2022.

Publisher

Springer-Verlag

Berlin, Heidelberg



Bibliometrics & Citations

Article Metrics

Total Citations: 7
Total Downloads: 0
Downloads (last 12 months): 0
Downloads (last 6 weeks): 0

Reflects downloads up to 06 Jan 2025

Cited By

Nguyen T, Puangthamawathanakun B, Arpnikanondt C, Gurrin C, Caputo A, Healy G (2023) Efficient Search with an Interactive Video Retrieval System for Novice Users in IVR4B. Proceedings of the 20th International Conference on Content-based Multimedia Indexing, pp. 168–172. DOI: 10.1145/3617233.3617273. Online publication date: 20-Sep-2023
https://dl.acm.org/doi/10.1145/3617233.3617273
Lokoč J, Andreadis S, Bailer W, Duane A, Gurrin C, Ma Z, Messina N, Nguyen T, Peška L, Rossetto L, Sauter L, Schall K, Schoeffmann K, Khan O, Spiess F, Vadicamo L, Vrochidis S (2023) Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS. Multimedia Systems 29:6, pp. 3481–3504. DOI: 10.1007/s00530-023-01143-5. Online publication date: 1-Dec-2023
https://dl.acm.org/doi/10.1007/s00530-023-01143-5
Nguyen T, Puangthamawathanakun B, Caputo A, Healy G, Nguyen B, Arpnikanondt C, Gurrin C (2023) VideoCLIP: An Interactive CLIP-based Video Retrieval System at VBS2023. MultiMedia Modeling, pp. 671–677. DOI: 10.1007/978-3-031-27077-2_57. Online publication date: 9-Jan-2023
https://dl.acm.org/doi/10.1007/978-3-031-27077-2_57
Sauter L, Gasser R, Heller S, Rossetto L, Saladin C, Spiess F, Schuldt H (2023) Exploring Effective Interactive Text-Based Video Search in vitrivr. MultiMedia Modeling, pp. 646–651. DOI: 10.1007/978-3-031-27077-2_53. Online publication date: 9-Jan-2023
https://dl.acm.org/doi/10.1007/978-3-031-27077-2_53
Lokoč J, Vopálková Z, Dokoupil P, Peška L (2023) Video Search with CLIP and Interactive Text Query Reformulation. MultiMedia Modeling, pp. 628–633. DOI: 10.1007/978-3-031-27077-2_50. Online publication date: 9-Jan-2023
https://dl.acm.org/doi/10.1007/978-3-031-27077-2_50
Amato G, Bolettieri P, Carrara F, Falchi F, Gennaro C, Messina N, Vadicamo L, Vairo C (2023) VISIONE at Video Browser Showdown 2023. MultiMedia Modeling, pp. 615–621. DOI: 10.1007/978-3-031-27077-2_48. Online publication date: 9-Jan-2023
https://dl.acm.org/doi/10.1007/978-3-031-27077-2_48
Carrara F, Vadicamo L, Gennaro C, Amato G (2022) Approximate Nearest Neighbor Search on Standard Search Engines. Similarity Search and Applications, pp. 214–221. DOI: 10.1007/978-3-031-17849-8_17. Online publication date: 5-Oct-2022
https://dl.acm.org/doi/10.1007/978-3-031-17849-8_17
