ViewFormer: View Set Attention for Multi-view 3D Shape Understanding

Sun, Hongyu; Wang, Yongcai; Wang, Peng; Cai, Xudong; Li, Deying

arXiv:2305.00161 (cs)

[Submitted on 29 Apr 2023]

Abstract: This paper presents ViewFormer, a simple yet effective model for multi-view 3D shape recognition and retrieval. We systematically investigate existing methods for aggregating multi-view information and propose a novel "view set" perspective, which minimizes the assumptions made about relations among the views and frees up representational flexibility. We devise an adaptive attention model to capture pairwise and higher-order correlations among the elements of the view set. The learned multi-view correlations are aggregated into an expressive view set descriptor for recognition and retrieval. Experiments show that the proposed method performs strongly across different tasks and datasets. For instance, with only 2 attention blocks and 4.8M learnable parameters, ViewFormer reaches 98.8% recognition accuracy on ModelNet40 for the first time, exceeding the previous best method by 1.1%. On the challenging RGBD dataset, our method achieves 98.4% recognition accuracy, a 4.1% absolute improvement over the strongest baseline. ViewFormer also sets new records on several evaluation dimensions of 3D shape retrieval defined on the SHREC'17 benchmark.
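
To make the "view set" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each view has already been encoded into a fixed-length feature vector, that attention is single-head with identity projections (no learned weights), that no positional encoding is applied, and that the final descriptor is obtained by max pooling. The names `self_attention` and `view_set_descriptor` are hypothetical, and the default of two blocks only echoes the "2 attention blocks" mentioned in the abstract.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(views):
    """Single-head scaled dot-product self-attention over a set of views.

    Assumption: identity query/key/value projections and a residual
    connection; a trained model would use learned projections instead.
    """
    d = len(views[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in views:
        # Pairwise correlation of this view with every view in the set.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in views]
        weights = softmax(scores)
        attended = [sum(w * v[j] for w, v in zip(weights, views))
                    for j in range(d)]
        out.append([x + y for x, y in zip(q, attended)])
    return out


def view_set_descriptor(views, num_blocks=2):
    """Stack attention blocks, then max-pool into one descriptor.

    Stacking blocks lets later blocks attend over already-mixed features,
    which is one way to reach beyond pairwise correlations. Max pooling
    does not depend on the order in which the views are listed.
    """
    feats = views
    for _ in range(num_blocks):
        feats = self_attention(feats)
    d = len(feats[0])
    return [max(f[j] for f in feats) for j in range(d)]


# Hypothetical 3-dimensional features for four views of one shape.
views = [[1.0, 0.0, 2.0], [0.5, 1.5, -1.0], [-0.3, 0.8, 0.1], [2.0, -1.0, 0.4]]
descriptor = view_set_descriptor(views)
shuffled = view_set_descriptor(views[::-1])
```

Because no positional information is attached to the views, reordering the input changes the descriptor only by floating-point rounding, which is the property a set-based formulation relies on.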

Comments:	15 pages, 10 figures, 16 tables
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2305.00161 [cs.CV]
	(or arXiv:2305.00161v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2305.00161

Submission history

From: Hongyu Sun
[v1] Sat, 29 Apr 2023 03:58:20 UTC (5,271 KB)
