Tokenizing Buildings: A Transformer for Layout Synthesis
By: Manuel Ladron de Guevara, Jinmo Rhee, Ardavan Bidgoli, and more
Potential Business Impact:
Builds better building plans automatically.
We introduce the Small Building Model (SBM), a Transformer-based architecture for layout synthesis in Building Information Modeling (BIM) scenes. We address the question of how to tokenize buildings by unifying heterogeneous feature sets of architectural elements into sequences while preserving compositional structure. These feature sets are represented as a sparse attribute-feature matrix that captures room properties. We then design a unified embedding module that learns joint representations of categorical and possibly correlated continuous feature groups. Finally, we train a single Transformer backbone in two modes: an encoder-only pathway that yields high-fidelity room embeddings, and an encoder-decoder pipeline for autoregressive prediction of room entities, referred to as Data-Driven Entity Prediction (DDEP). Experiments across retrieval and generative layout synthesis show that SBM learns compact room embeddings that reliably cluster by type and topology, enabling strong semantic retrieval. In DDEP mode, SBM produces functionally sound layouts, with fewer collisions, fewer boundary violations, and improved navigability.
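To make the unified embedding idea concrete, here is a minimal PyTorch sketch of how categorical room attributes and grouped continuous features could be fused into a single token embedding per room. The class name `RoomTokenEmbedding`, the argument names, and the choice of feature groups are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class RoomTokenEmbedding(nn.Module):
    """Hypothetical sketch of a unified embedding module: categorical
    attributes get learned embeddings, each group of correlated continuous
    features gets a small linear projection, and all parts are summed into
    one token vector per room (assumed design, not SBM's actual code)."""

    def __init__(self, num_room_types: int, cont_group_dims: list, d_model: int = 256):
        super().__init__()
        # Embedding table for the categorical attribute (here: room type).
        self.type_embed = nn.Embedding(num_room_types, d_model)
        # One linear projection per group of correlated continuous features,
        # e.g. a geometry group [x, y, w, h] or a size group [area].
        self.cont_proj = nn.ModuleList(
            nn.Linear(dim, d_model) for dim in cont_group_dims
        )
        self.norm = nn.LayerNorm(d_model)

    def forward(self, room_type: torch.Tensor, cont_groups: list) -> torch.Tensor:
        # room_type: (batch, seq) integer ids
        # cont_groups[i]: (batch, seq, cont_group_dims[i])
        tok = self.type_embed(room_type)
        for proj, feats in zip(self.cont_proj, cont_groups):
            tok = tok + proj(feats)
        return self.norm(tok)


# Usage: 2 layouts of 4 rooms each, with a geometry group (x, y, w, h)
# and a size group (area); the output is ready for a Transformer backbone.
embed = RoomTokenEmbedding(num_room_types=10, cont_group_dims=[4, 1])
types = torch.randint(0, 10, (2, 4))
geom = torch.rand(2, 4, 4)
area = torch.rand(2, 4, 1)
tokens = embed(types, [geom, area])  # shape: (2, 4, 256)
```

Summing the projected feature groups into one vector keeps the sequence length at one token per room, which matches the abstract's framing of rooms as entities in a sequence; an alternative would be to emit one token per attribute, trading longer sequences for finer-grained attention.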
Similar Papers
Innovative tokenisation of structured data for LLM training
Machine Learning (CS)
Turns messy data into neat lists for smart computers.
Layout Anything: One Transformer for Universal Room Layout Estimation
CV and Pattern Recognition
Draws accurate room layouts from pictures fast.
From Static Structures to Ensembles: Studying and Harnessing Protein Structure Tokenization
Machine Learning (CS)
Shows how proteins bend and move naturally.