Game-RL: Synthesizing Verifiable Game Tasks at Scale to Boost VLMs General Reasoning
By: Jingqi Tong, Jixin Tang, Hangcheng Li, and more
Potential Business Impact:
Teaches computers to reason about games and other visual tasks.
Real-world vision-language reasoning scenarios often involve diverse and complex tasks. However, vision-language reinforcement learning has primarily focused on a narrow set of tasks (e.g., geometry or chart reasoning), limiting improvements to Vision Language Models' (VLMs) general reasoning. We therefore propose Code2Logic, a novel approach that uses Large Language Models (LLMs) to synthesize verifiable game reasoning tasks at scale by adapting game code. Using Code2Logic, we built the GameQA dataset to train and evaluate VLMs. GameQA is verifiable and scalable, offers controllable difficulty gradation, and is diverse, covering 30 games and 158 tasks. We then apply Game-RL, simple reinforcement learning on GameQA. Surprisingly, despite training solely on game tasks, VLMs demonstrate out-of-domain generalization: Qwen2.5-VL-7B improves by 2.33% on average across 7 diverse vision-language benchmarks. Our code, dataset, and models are available in the GitHub repository.
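The core idea, synthesizing tasks whose answers are computed by the game code itself, can be sketched minimally. The function names, board game, and binary reward below are illustrative assumptions, not the paper's actual pipeline: the point is that the ground-truth answer is derived programmatically from the game state, so correctness is automatically verifiable at scale.

```python
import random

def synthesize_task(seed: int, size: int = 4):
    """Hypothetical sketch of a Code2Logic-style task: generate a game
    state in code, then emit a question whose gold answer the same code
    computes, so no human annotation is needed."""
    rng = random.Random(seed)
    board = [[rng.choice(["X", "O", "."]) for _ in range(size)]
             for _ in range(size)]
    # Ground truth comes from the game logic, not from a human label.
    answer = sum(row.count("X") for row in board)
    rendered = "\n".join(" ".join(row) for row in board)
    question = f"How many 'X' pieces are on the board below?\n{rendered}"
    return question, answer

def verify(predicted: int, gold: int) -> float:
    """Binary reward usable for RL: 1.0 iff the model's answer matches."""
    return 1.0 if predicted == gold else 0.0

question, gold = synthesize_task(seed=0)
reward = verify(gold, gold)  # a correct answer earns full reward
```

Because generation is seeded and fully programmatic, difficulty (board size, piece density) is controllable and the dataset scales without manual labeling, which is what makes such tasks usable as verifiable RL rewards.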
Similar Papers
Play to Generalize: Learning to Reason Through Game Play
CV and Pattern Recognition
Teaches AI to think better by playing games.
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis
Machine Learning (CS)
Teaches computers to solve harder math problems.
Are Large Vision Language Models Good Game Players?
CV and Pattern Recognition
Tests AI's smarts with fun games.