Gaming the Arena: AI Model Evaluation and the Viral Capture of Attention

Published: December 17, 2025 | arXiv ID: 2512.15252v1

By: Sam Hind

Potential Business Impact:

AI models are now evaluated in head-to-head 'battles', and developers increasingly game these arenas to capture attention and commercialize products.

Business Areas:
Artificial Intelligence, Data and Analytics, Science and Engineering, Software

Innovation in artificial intelligence (AI) has always depended on technological infrastructures, from code repositories to computing hardware. Yet industry -- rather than universities -- has become increasingly influential in shaping AI innovation. As generative forms of AI powered by large language models (LLMs) have driven the breakout of AI into the wider world, the AI community has sought to develop new methods for independently evaluating the performance of AI models. How best, in other words, to compare the performance of AI models against other AI models -- and how best to account for new models launched on a nearly daily basis? Building on recent work in media studies, STS, and computer science on benchmarking and the practices of AI evaluation, I examine the rise of so-called 'arenas' in which AI models are evaluated through gladiatorial-style 'battles'. Through a technography of a leading user-driven AI model evaluation platform, LMArena, I consider five themes central to the emerging 'arena-ization' of AI innovation. Accordingly, I argue that arena-ization is being powered by a 'viral' desire to capture attention both inside and outside the AI community, which is critical to the scaling and commercialization of AI products. In the discussion, I reflect on the implications of 'arena gaming', a phenomenon through which model developers hope to capture attention.

Country of Origin
🇬🇧 United Kingdom

Page Count
28 pages

Category
Computer Science:
Computers and Society