B-repLer: Semantic B-rep Latent Editor using Large Language Models
By: Yilin Liu, Niladri Shekhar Dutt, Changjian Li and more
Potential Business Impact:
Lets computers change 3D designs with words.
Multimodal large language models (mLLMs), trained in a mixed-modal setting as a universal model, have been shown to compete with or even outperform many specialized algorithms for imaging and graphics tasks. As demonstrated across many applications, mLLMs' ability to jointly process image and text data makes them suitable for zero-shot applications or efficient fine-tuning towards specialized tasks. However, they have had limited success in 3D analysis and editing tasks. This is due both to the lack of suitable (annotated) 3D data and to the idiosyncrasies of 3D representations. In this paper, we investigate whether mLLMs can be adapted to support high-level editing of Boundary Representation (B-rep) CAD objects. B-reps remain the industry standard for precisely encoding engineering objects, but are challenging because the representation is fragile (i.e., it can easily lead to invalid CAD objects) and no publicly available data source exists with semantically annotated B-reps or CAD construction history. We present B-repLer, a fine-tuned mLLM that can understand text prompts and make semantic edits on given B-reps to produce valid outputs. We enable this via a novel multimodal architecture, specifically designed to handle B-rep models, and demonstrate how existing CAD tools, in conjunction with mLLMs, can be used to automatically generate the required reasoning dataset, without relying on external annotations. We extensively evaluate B-repLer and demonstrate several text-based B-rep edits of varying complexity, which were not previously possible.
Similar Papers
BrepLLM: Native Boundary Representation Understanding with Large Language Models
CV and Pattern Recognition
Lets computers understand 3D shapes like we do.
Large Language Models as Visualization Agents for Immersive Binary Reverse Engineering
Human-Computer Interaction
VR and AI help understand computer code faster.
LLMs Can Also Do Well! Breaking Barriers in Semantic Role Labeling via Large Language Models
Computation and Language
Makes computers understand sentences better.