Gaming the Arena: AI Model Evaluation and the Viral Capture of Attention

Published: December 17, 2025 | arXiv ID: 2512.15252v1

By: Sam Hind

Potential Business Impact:

AI models now battle each other to improve.

Business Areas:

Artificial Intelligence, Data and Analytics, Science and Engineering, Software

Innovation in artificial intelligence (AI) has always been dependent on technological infrastructures, from code repositories to computing hardware. Yet industry -- rather than universities -- has become increasingly influential in shaping AI innovation. As generative forms of AI powered by large language models (LLMs) have driven the breakout of AI into the wider world, the AI community has sought to develop new methods for independently evaluating the performance of AI models. How best, in other words, to compare the performance of AI models against other AI models -- and how best to account for new models launched on nearly a daily basis? Building on recent work in media studies, STS, and computer science on benchmarking and the practices of AI evaluation, I examine the rise of so-called 'arenas' in which AI models are evaluated with reference to gladiatorial-style 'battles'. Through a technography of a leading user-driven AI model evaluation platform, LMArena, I consider five themes central to the emerging 'arena-ization' of AI innovation. Accordingly, I argue that the arena-ization is being powered by a 'viral' desire to capture attention both in, and outside of, the AI community, critical to the scaling and commercialization of AI products. In the discussion, I reflect on the implications of 'arena gaming', a phenomenon through which model developers hope to capture attention.

Inclusion Arena: An Open Platform for Evaluating Large Foundation Models with Real-World Apps

Artificial Intelligence

Tests AI by seeing how people like its answers.

15 Aug 2025

89%

Board Game Arena: A Framework and Benchmark for Assessing Large Language Models via Strategic Play

Artificial Intelligence

Tests AI smarts with board games.

5 Aug 2025

Country of Origin

🇬🇧 United Kingdom

Repos / Data Links

github.com

Page Count

28 pages
