LoopBench: Discovering Emergent Symmetry Breaking Strategies with LLM Swarms
By: Ali Parsaee, Yashar Talebirad, Csongor Szepesvári, and more
Potential Business Impact:
Helps AI agents work together to solve problems.
Large Language Models (LLMs) are increasingly used as autonomous agents, yet their ability to coordinate in distributed systems remains poorly understood. We introduce LoopBench, a benchmark for evaluating LLM reasoning in distributed symmetry breaking and meta-cognitive thinking. The benchmark focuses on coloring odd cycle graphs ($C_3$, $C_5$, $C_{11}$) with a limited palette of colors, a setting in which deterministic, non-communicating agents get stuck in infinite loops. A strategy-passing mechanism is implemented as a form of consistent memory. We show that while standard LLMs and classical heuristics struggle, advanced reasoning models (e.g., O3) devise strategies to escape deadlocks. LoopBench enables the study of distributed algorithms that emerge from language-based reasoning, offering a testbed for collective intelligence.
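To see why deterministic, non-communicating agents deadlock on an odd cycle, the following minimal Python sketch (not the paper's benchmark code; the greedy agent rule, the choice of $C_5$, and the three-color budget are illustrative assumptions) runs identical agents that each pick the lowest color unused by their neighbors, updating synchronously. Because every agent applies the same rule to a symmetric view, the system revisits an earlier state instead of reaching a proper coloring, which is the deadlock LoopBench is built around.

# Illustrative sketch: identical deterministic agents on the odd cycle C_5.
def neighbors(i, n):
    """Return the two neighbors of vertex i on the cycle C_n."""
    return [(i - 1) % n, (i + 1) % n]

def step(colors, n, num_colors=3):
    """One synchronous round: each agent greedily avoids its neighbors' colors."""
    new = []
    for i in range(n):
        used = {colors[j] for j in neighbors(i, n)}
        free = [c for c in range(num_colors) if c not in used]
        new.append(free[0] if free else colors[i])  # deterministic tie-break
    return new

def is_proper(colors, n):
    """A proper coloring gives adjacent vertices different colors."""
    return all(colors[i] != colors[(i + 1) % n] for i in range(n))

n = 5                    # odd cycle C_5
colors = [0] * n         # symmetric start: every agent chooses the same color
seen = set()
for round_num in range(20):
    if is_proper(colors, n):
        print(f"proper coloring after {round_num} rounds: {colors}")
        break
    state = tuple(colors)
    if state in seen:
        print(f"round {round_num}: state {colors} repeats, so the agents loop forever")
        break
    seen.add(state)
    colors = step(colors, n)

Running this prints a repeated state after two rounds: every agent flips between color 0 and color 1 in lockstep, so no symmetry is ever broken. Escaping the loop requires something outside the shared deterministic rule, such as randomness or the strategy-passing memory described above.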
Similar Papers
Benchmarking LLMs' Swarm intelligence
Multiagent Systems
Tests if AI can work together like a swarm.
LLM CHESS: Benchmarking Reasoning and Instruction-Following in LLMs through Chess
Artificial Intelligence
Tests how well AI plays and understands chess.
Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions
Artificial Intelligence
Tests AI that handles many jobs at once.