
Symbolic Graphics Programming with Large Language Models

Published: September 5, 2025 | arXiv ID: 2509.05208v1

By: Yamei Chen, Haoquan Zhang, Yangyi Huang and more

Potential Business Impact:

Computers draw pictures from words.

Business Areas:

Natural Language Processing Artificial Intelligence, Data and Analytics, Software

Large language models (LLMs) excel at program synthesis, yet their ability to produce symbolic graphics programs (SGPs) that render into precise visual content remains underexplored. We study symbolic graphics programming, where the goal is to generate an SGP from a natural-language description. Because the generated programs are rendered into images, the task also serves as a lens into how LLMs understand the visual world. Among the various kinds of SGPs, we focus on scalable vector graphics (SVGs). We begin by examining the extent to which LLMs can generate SGPs. To this end, we introduce SGP-GenBench, a comprehensive benchmark covering object fidelity, scene fidelity, and compositionality (attribute binding, spatial relations, numeracy). On SGP-GenBench, we find that frontier proprietary models substantially outperform open-source models, and that performance correlates well with general coding capabilities. Motivated by this gap, we aim to improve LLMs' ability to generate SGPs. We propose an approach based on reinforcement learning (RL) with verifiable rewards, in which a format-validity gate ensures renderable SVG and a cross-modal reward aligns the text with the rendered image via strong vision encoders (e.g., SigLIP for text-image and DINO for image-image). Applied to Qwen-2.5-7B, our method substantially improves SVG generation quality and semantics, achieving performance on par with frontier systems. We further analyze training dynamics, showing that RL induces (i) finer decomposition of objects into controllable primitives and (ii) contextual details that improve scene coherence. Our results demonstrate that symbolic graphics programming offers a precise and interpretable lens on cross-modal grounding.
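
The reward structure described in the abstract (a format-validity gate followed by a cross-modal score) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that XML well-formedness with an svg root element is an acceptable stand-in for "renderable", that a failed gate yields a reward of zero, and that the cross-modal score is supplied by the caller. The names is_well_formed_svg, gated_reward, and score_fn are hypothetical, and a real score_fn would render the SVG and compare it with the prompt using an encoder such as SigLIP.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# Namespace prefix that ElementTree attaches to tags of namespaced SVG documents.
SVG_NS = "{http://www.w3.org/2000/svg}"


def is_well_formed_svg(svg_text: str) -> bool:
    """Hypothetical gate: the text parses as XML and its root element is <svg>.

    Assumption: well-formedness is only a proxy for renderability.
    """
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False
    return root.tag in ("svg", SVG_NS + "svg")


def gated_reward(
    prompt: str,
    svg_text: str,
    score_fn: Callable[[str, str], float],
) -> float:
    """Return 0.0 when the gate fails, otherwise the caller-supplied score.

    Assumption: score_fn(prompt, svg_text) stands in for the cross-modal
    reward; the zero reward for invalid output is also an assumption.
    """
    if not is_well_formed_svg(svg_text):
        return 0.0
    return score_fn(prompt, svg_text)


if __name__ == "__main__":
    good = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="5" fill="red"/></svg>'
    bad = "<svg><circle"  # truncated output, fails the gate

    def constant_score(prompt: str, svg_text: str) -> float:
        # Placeholder score for demonstration only.
        return 0.5

    print(gated_reward("a red circle", good, constant_score))
    print(gated_reward("a red circle", bad, constant_score))
```

Keeping the gate separate from the score means an unparseable program never reaches the (comparatively expensive) rendering and encoding step.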

SVGen: Interpretable Vector Graphics Generation with Large Language Models

Machine Learning (CS)

Turns words into perfect computer drawings.

6 Aug 2025 1

89%

Bridging Vision Language Models and Symbolic Grounding for Video Question Answering

CV and Pattern Recognition

Helps computers understand videos better by seeing relationships.

15 Sep 2025 0

89%

Leveraging Large Language Models For Scalable Vector Graphics Processing: A Review

CV and Pattern Recognition

Lets computers create and fix drawings perfectly.

6 Mar 2025 0


Page Count

32 pages
