Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization

Pang, Zongshang; Nakashima, Yuta; Otani, Mayu; Nagahara, Hajime

arXiv:2211.10056 (cs)

[Submitted on 18 Nov 2022]

Abstract: Video summarization aims to select the most informative subset of frames in a video to facilitate efficient video browsing. Unsupervised methods usually rely on heuristic training objectives such as diversity and representativeness. However, such methods need to bootstrap the online-generated summaries to compute the objectives for importance score regression. We consider such a pipeline inefficient and seek to directly quantify the frame-level importance with the help of contrastive losses in the representation learning literature. Leveraging the contrastive losses, we propose three metrics featuring a desirable key frame: local dissimilarity, global consistency, and uniqueness. With features pre-trained on the image classification task, the metrics can already yield high-quality importance scores, demonstrating competitive or better performance than past heavily-trained methods. We show that by refining the pre-trained features with a lightweight contrastively learned projection module, the frame-level importance scores can be further improved, and the model can also leverage a large number of random videos and generalize to test videos with decent performance. Code available at this https URL.
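
The three frame-level metrics named in the abstract can be illustrated with a small sketch. This is not the paper's formulation: the abstract gives no formulas, so the definitions below (cosine similarity, a k-nearest-neighbour average, a mean-feature centroid) are assumptions chosen only to make the intuition concrete, and the function name `frame_metrics` and parameter `k` are hypothetical.

```python
import numpy as np


def frame_metrics(features, k=2, eps=1e-8):
    """Illustrative per-frame scores from pre-extracted frame features.

    ASSUMPTION: these formulas are simple stand-ins, not the definitions
    used in the paper. `features` is an (n_frames, dim) array, n_frames > k.
    """
    z = np.asarray(features, dtype=np.float64)
    # L2-normalize so that dot products are cosine similarities.
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)
    n = z.shape[0]
    sim = z @ z.T

    # Local dissimilarity (assumed): one minus the mean similarity to the
    # k most similar other frames.
    sim_no_self = sim.copy()
    np.fill_diagonal(sim_no_self, -np.inf)
    top_k = np.sort(sim_no_self, axis=1)[:, -k:]
    local_dissimilarity = 1.0 - top_k.mean(axis=1)

    # Global consistency (assumed): cosine similarity between each frame
    # and the normalized mean feature of the whole video.
    center = z.mean(axis=0)
    center = center / (np.linalg.norm(center) + eps)
    global_consistency = z @ center

    # Uniqueness (assumed): one minus the mean similarity to all other frames.
    mean_other = (sim.sum(axis=1) - np.diag(sim)) / (n - 1)
    uniqueness = 1.0 - mean_other

    return {
        "local_dissimilarity": local_dissimilarity,
        "global_consistency": global_consistency,
        "uniqueness": uniqueness,
    }


# Toy input: three near-duplicate frames and one outlier frame.
toy = np.array([[1.0, 0.0], [1.0, 0.01], [1.0, 0.02], [0.0, 1.0]])
scores = frame_metrics(toy, k=2)
```

On this toy input the outlier frame scores highest on the assumed uniqueness measure and lowest on the assumed global consistency measure, which shows why the metrics pull in different directions and would need to be combined to rank key frames.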

Comments: To appear in WACV 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2211.10056 [cs.CV]
(or arXiv:2211.10056v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2211.10056

Submission history

From: Zongshang Pang
[v1] Fri, 18 Nov 2022 07:01:28 UTC (8,101 KB)
