AGDC: Autoregressive Generation of Variable-Length Sequences with Joint Discrete and Continuous Spaces
By: Yeonsang Shin, Insoo Kim, Bongkeun Kim and more
Potential Business Impact:
Makes computer designs super accurate and detailed.
Transformer-based autoregressive models excel at data generation but are inherently constrained by their reliance on discretized tokens, which limits their ability to represent continuous values with high precision. We analyze the scalability limitations of existing discretization-based approaches for generating hybrid discrete-continuous sequences, particularly in high-precision domains such as semiconductor circuit design, where precision loss can lead to functional failure. To address this challenge, we propose AGDC, a novel unified framework that jointly models discrete and continuous values for variable-length sequences. AGDC employs a hybrid approach that combines categorical prediction for discrete values with diffusion-based modeling for continuous values, incorporating two key technical components: an end-of-sequence (EOS) logit adjustment mechanism that uses an MLP to dynamically adjust EOS token logits based on sequence context, and a length regularization term integrated into the loss function. Additionally, we present ContLayNet, a large-scale benchmark comprising 334K high-precision semiconductor layout samples with specialized evaluation metrics that capture functional correctness where precision errors significantly impact performance. Experiments on semiconductor layouts (ContLayNet), graphic layouts, and SVGs demonstrate AGDC's superior performance in generating high-fidelity hybrid vector representations compared to discretization-based and fixed-schema baselines, achieving scalable high-precision generation across diverse domains.
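The abstract's two length-control components can be illustrated with a minimal sketch. The paper does not publish its implementation details here, so everything below is an assumption: the function names (`adjusted_logits`, `length_regularizer`), the two-layer MLP shape, and the squared-error form of the length penalty are all hypothetical stand-ins for the mechanisms the abstract describes — an MLP that shifts only the EOS token's logit given sequence context, and a loss term that penalizes deviation from an expected length.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    # Two-layer MLP with ReLU: maps a context vector to a scalar shift.
    h = np.maximum(0.0, x @ W1 + b1)
    return (h @ W2 + b2).item()

def adjusted_logits(logits, context, params, eos_idx):
    # Hypothetical EOS logit adjustment: the MLP reads the sequence
    # context and shifts only the EOS token's logit, leaving all other
    # token logits untouched.
    out = logits.copy()
    out[eos_idx] += mlp(context, *params)
    return out

def length_regularizer(gen_len, target_len, lam=0.1):
    # Hypothetical length regularization term: a squared-error penalty
    # on the deviation of generated length from an expected length,
    # added to the training loss with weight lam.
    return lam * (gen_len - target_len) ** 2

# Toy setup: vocabulary of 5 tokens, index 4 is EOS; context dim 8.
V, D, H = 5, 8, 16
params = (rng.normal(size=(D, H)), np.zeros(H),
          rng.normal(size=(H, 1)), np.zeros(1))
logits = rng.normal(size=V)
context = rng.normal(size=D)

new_logits = adjusted_logits(logits, context, params, eos_idx=4)
```

The point of the sketch is the separation of concerns: the base model produces ordinary next-token logits, while a small context-conditioned head controls only when the sequence terminates, and the length penalty shapes that behavior during training.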
Similar Papers
D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens
CV and Pattern Recognition
Makes computers draw better pictures using two kinds of clues.
Rethinking Discrete Tokens: Treating Them as Conditions for Continuous Autoregressive Image Synthesis
CV and Pattern Recognition
Makes AI create clearer, more realistic pictures.
LGDC: Latent Graph Diffusion via Spectrum-Preserving Coarsening
Machine Learning (CS)
Creates better computer-made pictures of connections.