Forgetful by Design? A Critical Audit of YouTube's Search API for Academic Research
By: Bernhard Rieder, Adrian Padilla, Oscar Coromina
Potential Business Impact:
YouTube searches miss many videos for research.
This paper critically audits the search endpoint of YouTube's Data API (v3), a common tool for academic research. Through systematic weekly searches over six months using eleven queries, we identify major limitations regarding completeness, representativeness, consistency, and bias. Our findings reveal substantial differences between ranking parameters such as relevance and date in terms of video recall and precision, with relevance often retrieving numerous off-topic videos. We also find severe temporal decay: the number of findable videos for a specific period decreases dramatically just 20-60 days after the publication date, potentially hampering many different research designs. Furthermore, search results lack consistency, with identical queries yielding different video sets over time, which compromises replicability. A case study on the European Parliament elections highlights how these issues affect research outcomes. While the paper offers several mitigation strategies, it concludes that the API's search function, which potentially prioritizes "freshness" over comprehensive retrieval, is not adequate for robust academic research, especially with regard to Digital Services Act requirements.
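The consistency and temporal-decay problems described above can be made concrete with a small sketch. This is not the authors' method: it assumes that each weekly search for the same query and publication window has already been stored as a set of video IDs, and it uses two illustrative measures (Jaccard overlap between snapshots, and the share of all ever-seen videos that a snapshot still returns). The function names and the sample data are hypothetical.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two result sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def retention(snapshots: Dict[str, Set[str]]) -> Dict[str, float]:
    """For each snapshot, the fraction of all ever-seen video IDs it still returns.

    The union of all snapshots is only a lower bound on the true set of
    matching videos, so these values overestimate actual recall.
    """
    ever_seen: Set[str] = set().union(*snapshots.values())
    if not ever_seen:
        return {date: 0.0 for date in snapshots}
    return {date: len(ids) / len(ever_seen) for date, ids in snapshots.items()}


# Hypothetical snapshots: the same query and publication window,
# searched on three different dates (keys are the search dates).
snapshots = {
    "2024-06-01": {"a", "b", "c", "d"},
    "2024-06-08": {"a", "b", "c", "e"},
    "2024-07-20": {"a", "e"},
}

week_to_week = jaccard(snapshots["2024-06-01"], snapshots["2024-06-08"])
shares = retention(snapshots)

print(f"overlap between first two searches: {week_to_week:.2f}")
for date, share in shares.items():
    print(f"{date}: {share:.2f} of ever-seen videos still returned")
```

A falling retention share for later search dates would correspond to the temporal decay the paper reports, while a low overlap between searches made close together would point to the consistency problem.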