Emergent Correspondence from Image Diffusion

Tang, Luming; Jia, Menglin; Wang, Qianqian; Phoo, Cheng Perng; Hariharan, Bharath


arXiv:2306.03881 (cs)

[Submitted on 6 Jun 2023 (v1), last revised 6 Dec 2023 (this version, v2)]

Abstract: Finding correspondences between images is a fundamental problem in computer vision. In this paper, we show that correspondence emerges in image diffusion models without any explicit supervision. We propose a simple strategy to extract this implicit knowledge out of diffusion networks as image features, namely DIffusion FeaTures (DIFT), and use them to establish correspondences between real images. Without any additional fine-tuning or supervision on the task-specific data or annotations, DIFT is able to outperform both weakly-supervised methods and competitive off-the-shelf features in identifying semantic, geometric, and temporal correspondences. Particularly for semantic correspondence, DIFT from Stable Diffusion is able to outperform DINO and OpenCLIP by 19 and 14 accuracy points respectively on the challenging SPair-71k benchmark. It even outperforms the state-of-the-art supervised methods on 9 out of 18 categories while remaining on par for the overall performance.
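
The abstract describes using image features to establish correspondences between images but does not spell out the matching step. The sketch below illustrates one common way such features could be used: nearest-neighbor matching under cosine similarity between two dense feature maps. It is an illustrative assumption, not the authors' implementation. The function name `match_points`, the `(C, H, W)` array layout, and the small epsilon are hypothetical, and the step that would extract features from a diffusion network is not shown.

```python
import numpy as np


def match_points(feat_src, feat_tgt, query_yx):
    """Find, for each query location in the source feature map, the
    target location whose feature has the highest cosine similarity.

    feat_src, feat_tgt: float arrays of shape (C, H, W) (assumed layout).
    query_yx: iterable of (row, col) positions in the source map.
    Returns a list of (row, col) positions in the target map.
    """
    c, h, w = feat_tgt.shape

    # L2-normalize along the channel axis so that a dot product
    # between two feature vectors equals their cosine similarity.
    eps = 1e-8  # hypothetical guard against division by zero
    src = feat_src / (np.linalg.norm(feat_src, axis=0, keepdims=True) + eps)
    tgt = feat_tgt / (np.linalg.norm(feat_tgt, axis=0, keepdims=True) + eps)

    tgt_flat = tgt.reshape(c, h * w)  # one column per target pixel

    matches = []
    for y, x in query_yx:
        sim = src[:, y, x] @ tgt_flat      # similarity to every target pixel
        idx = int(np.argmax(sim))          # best-matching flat index
        matches.append((idx // w, idx % w))  # convert back to (row, col)
    return matches
```

With real data, `feat_src` and `feat_tgt` would be feature maps computed for two images by whatever extractor is in use; any array pair with matching channel counts works for this sketch.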

Comments:	NeurIPS 2023. Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.03881 [cs.CV]
	(or arXiv:2306.03881v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2306.03881

Submission history

From: Luming Tang
[v1] Tue, 6 Jun 2023 17:33:19 UTC (18,581 KB)
[v2] Wed, 6 Dec 2023 17:58:25 UTC (30,311 KB)
