
Discovering Interpretable Programmatic Policies via Multimodal LLM-assisted Evolutionary Search

Published: August 7, 2025 | arXiv ID: 2508.05433v1

By: Qinglong Hu, Xialiang Tong, Mingxuan Yuan and more

Potential Business Impact:

Makes smart robots explain their actions clearly.

Interpretability and high performance are essential goals in designing control policies, particularly for safety-critical tasks. Deep reinforcement learning has greatly enhanced performance, yet its inherent lack of interpretability often undermines trust and hinders real-world deployment. This work addresses these dual challenges by introducing a novel approach for programmatic policy discovery, called Multimodal Large Language Model-assisted Evolutionary Search (MLES). MLES utilizes multimodal large language models as policy generators, combining them with evolutionary mechanisms for automatic policy optimization. It integrates visual feedback-driven behavior analysis within the policy generation process to identify failure patterns and facilitate targeted improvements, enhancing the efficiency of policy discovery and producing adaptable, human-aligned policies. Experimental results show that MLES achieves policy discovery capabilities and efficiency comparable to Proximal Policy Optimization (PPO) across two control tasks, while offering transparent control logic and traceable design processes. This paradigm overcomes the limitations of predefined domain-specific languages, facilitates knowledge transfer and reuse, and is scalable across various control tasks. MLES shows promise as a leading approach for the next generation of interpretable control policy discovery.
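The loop described in the abstract (a generator proposes policies, an evaluator scores them and reports how they failed, and evolutionary selection keeps the best) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "policy" is a single gain in the rule `action = -gain * x` on a toy scalar system, `mock_generator` is a hypothetical stand-in for the multimodal LLM, and the `feedback` string is a crude placeholder for visual behavior analysis. Survivor selection here is a simple (mu + lambda) scheme, which may differ from what MLES actually uses.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    gain: float     # toy "policy": action = -gain * x (assumption, not from the paper)
    score: float    # higher is better (negative accumulated cost)
    feedback: str   # coarse behavior summary handed back to the generator


def evaluate(gain: float, steps: int = 20) -> Tuple[float, str]:
    """Roll out the toy policy on x <- x + 0.5 * action, starting from x = 1."""
    x = 1.0
    cost = 0.0
    first_x = None
    for _ in range(steps):
        x = x + 0.5 * (-gain * x)
        if first_x is None:
            first_x = x  # state after the first step, used for the feedback label
        cost += x * x
    # Placeholder for behavior analysis: did the first step stop short of 0 or pass it?
    if first_x > 0:
        feedback = "undershoot"
    elif first_x < 0:
        feedback = "overshoot"
    else:
        feedback = "ok"
    return -cost, feedback


def make_candidate(gain: float) -> Candidate:
    score, feedback = evaluate(gain)
    return Candidate(gain, score, feedback)


def mock_generator(parent: Candidate, rng: random.Random) -> float:
    """Hypothetical stand-in for the LLM: nudge the gain in the direction the feedback suggests."""
    step = rng.uniform(0.0, 0.5)
    if parent.feedback == "undershoot":
        return parent.gain + step
    if parent.feedback == "overshoot":
        return parent.gain - step
    return parent.gain


def evolve(
    generator: Callable[[Candidate, random.Random], float],
    pop_size: int = 6,
    generations: int = 30,
    seed: int = 0,
) -> Tuple[Candidate, List[float]]:
    """(mu + lambda) search: every parent yields one child, the best pop_size survive."""
    rng = random.Random(seed)
    population = [make_candidate(rng.uniform(0.0, 4.0)) for _ in range(pop_size)]
    history = [max(c.score for c in population)]  # best score per generation
    for _ in range(generations):
        children = [make_candidate(generator(p, rng)) for p in population]
        ranked = sorted(population + children, key=lambda c: c.score, reverse=True)
        population = ranked[:pop_size]
        history.append(population[0].score)
    return population[0], history
```

Calling `best, history = evolve(mock_generator)` returns the best surviving candidate and the best score after each generation. Because parents compete with their children for survival, the best score never decreases. In a real system the generator would return program text rather than a number, and the feedback would come from analyzing rendered rollouts.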

PolicyEvolve: Evolving Programmatic Policies by LLMs for multi-player games via Population-Based Training

Machine Learning (CS)

Makes game AI learn faster and play smarter.

7 Sep 2025 1

88%

Socialized Learning and Emergent Behaviors in Multi-Agent Systems based on Multimodal Large Language Models

Multiagent Systems

AI learns to work together by seeing and talking.

21 Oct 2025 0

88%

Data-Driven Discovery of Interpretable Kalman Filter Variants through Large Language Models and Genetic Programming

Neural and Evolutionary Computing

Finds better math tools for science.

13 Aug 2025 0


Country of Origin

🇭🇰 Hong Kong

Page Count

27 pages
