LayoutCoT: Unleashing the Deep Reasoning Potential of Large Language Models for Layout Generation
By: Hengyu Shi, Junhao Su, Junfeng Luo, and more
Potential Business Impact:
Makes computer designs look better automatically.
Conditional layout generation aims to automatically produce visually appealing and semantically coherent layouts from user-defined constraints. While recent methods based on generative models have shown promising results, they typically require substantial amounts of training data or extensive fine-tuning, limiting their versatility and practical applicability. Alternatively, some training-free approaches leveraging in-context learning with Large Language Models (LLMs) have emerged, but they often suffer from limited reasoning capabilities and overly simplistic ranking mechanisms, which restrict their ability to generate consistently high-quality layouts. To address these limitations, we propose LayoutCoT, a novel approach that leverages the reasoning capabilities of LLMs through a combination of Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) techniques. Specifically, LayoutCoT transforms layout representations into a standardized serialized format suitable for processing by LLMs. A layout-aware RAG module retrieves relevant exemplars, from which the LLM generates a coarse layout. This preliminary layout, together with the selected exemplars, is then fed into a specially designed CoT reasoning module for iterative refinement, significantly enhancing both semantic coherence and visual quality. We conduct extensive experiments on five public datasets spanning three conditional layout generation tasks. The results demonstrate that LayoutCoT achieves state-of-the-art performance without any training or fine-tuning. Notably, our CoT reasoning module enables standard LLMs, even those without explicit deep reasoning abilities, to outperform specialized deep-reasoning models such as DeepSeek-R1, highlighting the potential of our approach for unleashing the deep reasoning capabilities of LLMs in layout generation tasks.
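The abstract describes a three-stage pipeline: serialize the layout into text, retrieve exemplars with a layout-aware RAG, then refine a coarse draft through CoT prompting. The minimal Python sketch below illustrates that flow under stated assumptions; the serialization format, the label-overlap retrieval score, and the call_llm stub are illustrative stand-ins, not the paper's actual implementation.

# Minimal sketch of the LayoutCoT pipeline as summarized in the abstract.
# The serialization format, the retrieval scoring, and call_llm are
# illustrative assumptions, not the authors' published code.
from dataclasses import dataclass

@dataclass
class Element:
    label: str   # e.g. "title", "image", "text"
    x: float     # top-left corner, normalized to [0, 1]
    y: float
    w: float     # width and height, normalized to [0, 1]
    h: float

def serialize(layout: list[Element]) -> str:
    """Turn a layout into a standardized serialized form an LLM can read."""
    return "\n".join(
        f"{e.label}: ({e.x:.2f}, {e.y:.2f}, {e.w:.2f}, {e.h:.2f})"
        for e in layout
    )

def retrieve_exemplars(constraint_labels, corpus, k=3):
    """Layout-aware retrieval, approximated here by label-set overlap
    between the user constraint and each stored layout."""
    def score(layout):
        return len({e.label for e in layout} & set(constraint_labels))
    return sorted(corpus, key=score, reverse=True)[:k]

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: plug in any chat-completion API here.
    raise NotImplementedError

def layout_cot(constraint_labels, corpus):
    exemplars = retrieve_exemplars(constraint_labels, corpus)
    examples = "\n---\n".join(serialize(ex) for ex in exemplars)
    # Stage 1: coarse layout from retrieved exemplars (in-context learning).
    coarse = call_llm(
        f"Example layouts:\n{examples}\n\n"
        f"Generate a layout containing: {', '.join(constraint_labels)}"
    )
    # Stage 2: chain-of-thought refinement of the coarse draft.
    return call_llm(
        f"Examples:\n{examples}\n\nDraft layout:\n{coarse}\n\n"
        "Reason step by step about alignment, overlap, and spacing, "
        "then output the corrected layout in the same format."
    )

Splitting generation into a coarse pass and a CoT refinement pass mirrors the abstract's claim that the refinement stage, not raw model scale, is what lifts standard LLMs past dedicated deep-reasoning models.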
Similar Papers
CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models
Computation and Language
Makes AI think better and more reliably.
Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning
Computation and Language
Lets computers think faster without words.
From Perception to Reasoning: Deep Thinking Empowers Multimodal Large Language Models
Computation and Language
Helps AI "think step-by-step" to solve harder problems.