Temporal Matching Kernel with Explicit Feature Maps

Authors:

Sébastien Poullot,

Shunsuke Tsukatani,

Anh Phuong Nguyen,

Shin'Ichi Satoh

MM '15: Proceedings of the 23rd ACM international conference on Multimedia

Pages 381 - 390

https://doi.org/10.1145/2733373.2806228

Published: 13 October 2015

Abstract

This paper proposes a framework for content-based video retrieval that addresses various tasks such as particular event retrieval, copy detection, and video synchronization. Given a video query, the method efficiently retrieves, from a large collection, similar video events or near-duplicates with temporally consistent excerpts. As a byproduct of the representation, it provides a precise temporal alignment of the query and the detected video excerpts.

Our method converts a series of frame descriptors into a single visual-temporal descriptor, called a temporal invariant match kernel. This representation takes into account the relative positions of the visual frames: the frame descriptors are jointly encoded with their timestamps. When matching two videos, the method produces a score function for all possible relative timestamps, which is maximized to obtain both the similarity score and the relative time offset.
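The joint descriptor/timestamp encoding can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a periodic temporal kernel expanded as a truncated Fourier series, and the function names (`tmk_descriptor`, `tmk_scores`), the Gaussian-like coefficients, the period, and the number of frequencies are all hypothetical choices made for illustration. Each video is reduced to one fixed-size array, and the score for every candidate offset is computed from the two arrays alone.

```python
import numpy as np

def tmk_descriptor(frames, times, period, n_freq):
    """Encode frame descriptors jointly with their timestamps.

    frames: (n, d) real array, times: (n,) timestamps.
    Returns a complex (n_freq + 1, d) array: one pooled descriptor per frequency.
    """
    m = np.arange(n_freq + 1)[:, None]
    phase = np.exp(2j * np.pi * m * times[None, :] / period)  # (n_freq + 1, n)
    return phase @ frames

def tmk_scores(desc_x, desc_y, offsets, period, coeffs):
    """Score every candidate offset between two encoded videos.

    coeffs[m] are the (assumed) Fourier coefficients of the temporal kernel.
    """
    cross = np.sum(np.conj(desc_x) * desc_y, axis=1)           # (n_freq + 1,)
    m = np.arange(len(coeffs))
    shift = np.exp(-2j * np.pi * m[:, None] * offsets[None, :] / period)
    weights = coeffs * np.where(m == 0, 1.0, 2.0)  # frequencies m and -m are paired
    return np.real((weights * cross) @ shift)

# Toy data: the second video is the first one delayed by 7 time units.
rng = np.random.default_rng(0)
n_frames, dim = 40, 64
period, n_freq = 128.0, 32
frames = rng.standard_normal((n_frames, dim))
frames /= np.linalg.norm(frames, axis=1, keepdims=True)
times_x = np.arange(n_frames, dtype=float)
times_y = times_x + 7.0

# Illustrative coefficients only (Gaussian-like decay), an assumption of this sketch.
m = np.arange(n_freq + 1)
coeffs = np.exp(-0.5 * (2 * np.pi * m / period) ** 2)

desc_x = tmk_descriptor(frames, times_x, period, n_freq)
desc_y = tmk_descriptor(frames, times_y, period, n_freq)
offsets = np.arange(-20.0, 21.0)
scores = tmk_scores(desc_x, desc_y, offsets, period, coeffs)
best_offset = offsets[np.argmax(scores)]  # maximizer gives the relative time offset
print(best_offset, scores.max())
```

The maximum of `scores` plays the role of the similarity score, and its position plays the role of the relative time offset. In this sketch the period must exceed the range of time differences being compared, otherwise the periodic kernel wraps around.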

We then propose two complementary contributions that further improve detection and localization performance. The first is a novel query expansion method that takes advantage of the joint descriptor/timestamp representation to automatically align the first result set and produce an enriched temporal query. In contrast to other query expansion methods proposed for videos, it preserves the localization capability. The second improves the localization trade-off between quality and representation size by using several complementary temporal match kernels.
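One way to picture the alignment step of such a query expansion is sketched below. This is an assumption-laden illustration, not the method from the paper: it assumes a Fourier-style temporal encoding in which a time shift becomes a per-frequency phase rotation, and the helper names (`fourier_descriptor`, `shift_descriptor`, `expand_query`) and the plain averaging rule are hypothetical. Because aligned descriptors are averaged in the same temporal reference frame as the query, the enriched query still carries timing information.

```python
import numpy as np

def fourier_descriptor(frames, times, period, n_freq):
    """Pool frame descriptors with complex temporal phases, one row per frequency."""
    m = np.arange(n_freq + 1)[:, None]
    return np.exp(2j * np.pi * m * times[None, :] / period) @ frames

def shift_descriptor(desc, delta, period):
    """Move an encoded video by `delta` time units without re-reading its frames."""
    m = np.arange(desc.shape[0])[:, None]
    return desc * np.exp(2j * np.pi * m * delta / period)

def expand_query(query_desc, result_descs, result_offsets, period):
    """Average the query with results re-aligned to the query timeline.

    result_offsets[k] is the estimated delay of result k relative to the query.
    """
    aligned = [shift_descriptor(d, -off, period)
               for d, off in zip(result_descs, result_offsets)]
    return (query_desc + sum(aligned)) / (1 + len(aligned))

# Toy check: a result that is the query delayed by 5 time units.
rng = np.random.default_rng(0)
frames = rng.standard_normal((30, 8))
times = np.arange(30, dtype=float)
period, n_freq = 128.0, 16

query = fourier_descriptor(frames, times, period, n_freq)
result = fourier_descriptor(frames, times + 5.0, period, n_freq)
expanded = expand_query(query, [result], [5.0], period)
```

In this toy case the result is an exact delayed copy, so re-aligning it reproduces the query descriptor and the expanded query is unchanged. With real results, the aligned descriptors would add complementary content at consistent timestamps.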

We evaluate our approach on benchmarks for particular event retrieval, copy detection, and video synchronization. Our experiments show that our approach achieves excellent detection and localization results.

Index Terms

Temporal Matching Kernel with Explicit Feature Maps
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
    2. Retrieval tasks and goals
      1. Document filtering
      2. Information extraction

Published In

MM '15: Proceedings of the 23rd ACM international conference on Multimedia

October 2015

1402 pages

ISBN:9781450334594

DOI:10.1145/2733373

General Chairs: Xiaofang Zhou (The University of Queensland, Australia), Alan F. Smeaton (Dublin City University, Ireland), Qi Tian (The University of Texas at San Antonio, USA)

Program Chairs: Dick C.A. Bulterman (FXPAL, USA), Heng Tao Shen (The University of Queensland, Australia), Ketan Mayer-Patel (The University of North Carolina, USA), Shuicheng Yan (National University of Singapore, Singapore)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

ERC project Viamass
KAKENHI

Conference

MM '15

Sponsor:

SIGMM

MM '15: ACM Multimedia Conference

October 26 - 30, 2015

Brisbane, Australia

Acceptance Rates

MM '15 Paper Acceptance Rate 56 of 252 submissions, 22%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
