Evolutionary Discovery of Heuristic Policies for Traffic Signal Control
By: Ruibing Wang, Shuhan Guo, Zeen Li, and more
Potential Business Impact:
Teaches traffic lights to be smarter and faster.
Traffic Signal Control (TSC) involves a challenging trade-off: classic heuristics are efficient but oversimplified, while Deep Reinforcement Learning (DRL) achieves high performance yet suffers from poor generalization and opaque policies. Online Large Language Models (LLMs) provide general reasoning but incur high latency and lack environment-specific optimization. To address these issues, we propose Temporal Policy Evolution for Traffic, which uses LLMs as an evolution engine to derive specialized heuristic policies. The framework introduces two key modules: (1) Structured State Abstraction (SSA), which converts high-dimensional traffic data into temporal-logical facts for reasoning; and (2) Credit Assignment Feedback (CAF), which traces flawed micro-decisions back to poor macro-outcomes for targeted critique. Operating entirely at the prompt level without training, the framework yields lightweight, robust policies optimized for specific traffic environments, outperforming both heuristics and online LLM actors.
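To make the Structured State Abstraction idea concrete, here is a minimal sketch of converting raw lane measurements into temporal-logical facts suitable for an LLM prompt. All names (`LaneState`, `abstract_state`, the `congestion_threshold` parameter) and the fact format are illustrative assumptions, not the paper's actual interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LaneState:
    """Snapshot of one incoming lane at an intersection (hypothetical schema)."""
    lane_id: str
    queue_len: int        # vehicles currently waiting
    prev_queue_len: int   # queue length at the previous time step


def abstract_state(lanes: List[LaneState], congestion_threshold: int = 8) -> List[str]:
    """Turn high-dimensional traffic readings into compact temporal-logical facts.

    Each fact captures a congestion level plus a temporal trend, which is the
    kind of discrete, reasoning-friendly input SSA aims to produce.
    """
    facts = []
    for lane in lanes:
        trend = ("growing" if lane.queue_len > lane.prev_queue_len
                 else "shrinking" if lane.queue_len < lane.prev_queue_len
                 else "stable")
        level = "congested" if lane.queue_len >= congestion_threshold else "free"
        facts.append(f"{lane.lane_id}: {level}, queue {trend} "
                     f"({lane.prev_queue_len} -> {lane.queue_len})")
    return facts


facts = abstract_state([
    LaneState("north_in", queue_len=11, prev_queue_len=7),
    LaneState("east_in", queue_len=2, prev_queue_len=4),
])
print(facts[0])  # north_in: congested, queue growing (7 -> 11)
```

A derived heuristic policy could then pattern-match on these facts (e.g. extend green for any lane marked "congested" with a "growing" queue) without consulting an LLM at decision time.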
Similar Papers
Traffic-R1: Reinforced LLMs Bring Human-Like Reasoning to Traffic Signal Control Systems
Artificial Intelligence
Cuts traffic queues with smart, adaptable lights
CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control
Machine Learning (CS)
Makes traffic lights work together better.
Large-scale Regional Traffic Signal Control Based on Single-Agent Reinforcement Learning
Machine Learning (CS)
Makes traffic lights smarter to reduce jams.