CadVLM: Bridging Language and Vision in the Generation of Parametric CAD Sketches

Wu, Sifan; Khasahmadi, Amir; Katz, Mor; Jayaraman, Pradeep Kumar; Pu, Yewen; Willis, Karl; Liu, Bang

arXiv:2409.17457 (cs)

[Submitted on 26 Sep 2024]

Abstract: Parametric Computer-Aided Design (CAD) is central to contemporary mechanical design. However, it encounters challenges in achieving precise parametric sketch modeling and lacks practical evaluation metrics suitable for mechanical design. We harness the capabilities of pre-trained foundation models, renowned for their successes in natural language processing and computer vision, to develop generative models specifically for CAD. These models are adept at understanding complex geometries and design reasoning, a crucial advancement in CAD technology. In this paper, we propose CadVLM, an end-to-end vision language model for CAD generation. Our approach involves adapting pre-trained foundation models to manipulate engineering sketches effectively, integrating both sketch primitive sequences and sketch images. Extensive experiments demonstrate superior performance on multiple CAD sketch generation tasks such as CAD autocompletion, CAD autoconstraint, and image conditional generation. To our knowledge, this is the first instance of a multimodal Large Language Model (LLM) being successfully applied to parametric CAD generation, representing a pioneering step in the field of computer-aided mechanical design.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2409.17457 [cs.CV]
	(or arXiv:2409.17457v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.17457

Submission history

From: Sifan Wu
[v1] Thu, 26 Sep 2024 01:22:29 UTC (3,273 KB)