Text-to-CadQuery: A New Paradigm for CAD Generation with Scalable Large Model Capabilities
By: Haoyang Xie, Feng Ju
Potential Business Impact:
Turns words into 3D computer designs.
Computer-aided design (CAD) is fundamental to modern engineering and manufacturing, but creating CAD models still requires expert knowledge and specialized software. Recent advances in large language models (LLMs) open up the possibility of generative CAD, where natural language is translated directly into parametric 3D models. However, most existing methods generate task-specific command sequences that pretrained models cannot handle directly. These sequences must be converted into CAD representations such as CAD vectors before a 3D model can be produced, which requires training models from scratch and adds unnecessary complexity. To address this, we propose generating CadQuery code, a Python-based CAD scripting language, directly from text, so that pretrained LLMs can produce 3D models without intermediate representations. Since LLMs already excel at Python generation and spatial reasoning, fine-tuning them on Text-to-CadQuery data proves highly effective. Given that these capabilities typically improve with scale, we hypothesize that larger models will perform better after fine-tuning. To test this, we augment the Text2CAD dataset with 170,000 CadQuery annotations. We fine-tune six open-source LLMs of varying sizes and observe consistent improvements. Our best model achieves a top-1 exact match of 69.3%, up from 58.8%, and reduces Chamfer Distance by 48.6%. Project page: https://github.com/Text-to-CadQuery/Text-to-CadQuery.
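The abstract reports Chamfer Distance without defining it. As a rough illustration, the sketch below computes one common symmetric variant (mean squared nearest-neighbor distance in each direction, summed) on two toy point sets. The function name, the toy points, and the choice of variant are assumptions made for illustration; the abstract does not state the exact formulation, sampling density, or normalization the authors use.

```python
from math import dist


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed variant: for each point, take the squared Euclidean distance to
    its nearest neighbor in the other set, average per direction, then sum
    the two directions. Other variants (unsquared, max-based) exist.
    """
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def one_sided(src, dst):
        # Mean squared distance from each point in src to its nearest point in dst.
        return sum(min(dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    return one_sided(points_a, points_b) + one_sided(points_b, points_a)


# Toy point sets standing in for points sampled from a reference shape and a
# generated shape (hypothetical data, not from the paper).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
generated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]

cd_identical = chamfer_distance(reference, reference)  # 0.0 for identical sets
cd_shifted = chamfer_distance(reference, generated)    # one point moved by 1 unit
print(cd_identical, cd_shifted)
```

A lower value means the generated geometry lies closer to the reference, which is why the abstract reports a reduction in Chamfer Distance as an improvement. This brute-force version is quadratic in the number of points; evaluations on dense point clouds would typically use a spatial index instead.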
Similar Papers
CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation
CV and Pattern Recognition
Lets computers design 3D shapes from words.
Generative AI for CAD Automation: Leveraging Large Language Models for 3D Modelling
Human-Computer Interaction
Lets computers design things from your words.
CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization
Machine Learning (CS)
Computers build 3D designs from simple words.