Automated Thematic Analyses Using LLMs: Xylazine Wound Management Social Media Chatter Use Case
By: JaMor Hairston, Ritvik Ranjan, Sahithi Lakamana, and more
Potential Business Impact:
Computers find patterns in online conversations.
Background: Large language models (LLMs) face challenges in inductive thematic analysis, a task requiring deep interpretive and domain-specific expertise. We evaluated the feasibility of using LLMs to replicate expert-driven thematic analysis of social media data.
Methods: Using two temporally non-intersecting Reddit datasets on xylazine (n=286 and n=686, for model optimization and validation, respectively) with twelve expert-derived themes, we evaluated five LLMs against expert coding. We modeled the task as a series of binary classifications rather than a single multi-label classification, employing zero-, single-, and few-shot prompting strategies and measuring performance via accuracy, precision, recall, and F1-score.
Results: On the validation set, GPT-4o with two-shot prompting performed best (accuracy: 90.9%; F1-score: 0.71). For high-prevalence themes, model-derived thematic distributions closely mirrored expert classifications (e.g., xylazine use: 13.6% vs. 17.8%; MOUD use: 16.5% vs. 17.8%).
Conclusions: Our findings suggest that few-shot LLM-based approaches can automate thematic analyses, offering a scalable supplement for qualitative research.
Keywords: thematic analysis, large language models, natural language processing, qualitative analysis, social media, prompt engineering, public health
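As a rough illustration of the Methods (not the paper's actual pipeline, prompts, or models), each theme can be treated as its own yes/no classifier driven by a few-shot prompt and scored against expert labels. In the sketch below, the call_llm function, the example prompt text, and the scikit-learn metric calls are assumptions introduced for illustration only.

# Minimal sketch: one theme framed as a binary classification with two-shot
# prompting, scored against expert coding with standard metrics.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical two-shot prompt for a single theme (wording is illustrative).
PROMPT_TEMPLATE = """You are coding Reddit posts for the theme: xylazine wound management.
Answer only "Yes" or "No".

Post: "My leg wound from tranq won't heal, any advice on dressing it?"
Theme present: Yes

Post: "Prices around here have gone up a lot lately."
Theme present: No

Post: "{post}"
Theme present:"""

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM API call (e.g., a chat-completion endpoint).
    Swap in a real client; should return the model's raw text response."""
    raise NotImplementedError

def classify_post(post: str) -> int:
    """Return 1 if the model marks the theme as present, else 0."""
    response = call_llm(PROMPT_TEMPLATE.format(post=post))
    return 1 if response.strip().lower().startswith("yes") else 0

def evaluate_theme(posts: list[str], expert_labels: list[int]) -> dict:
    """Compare model predictions for one theme against expert (gold) labels."""
    preds = [classify_post(p) for p in posts]
    precision, recall, f1, _ = precision_recall_fscore_support(
        expert_labels, preds, average="binary", zero_division=0
    )
    return {
        "accuracy": accuracy_score(expert_labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

Repeating this per theme yields twelve independent binary classifiers, whose per-theme prevalence can then be compared with the expert-derived distribution, which is the comparison reported in the Results.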
Similar Papers
Large Language Models in Thematic Analysis: Prompt Engineering, Evaluation, and Guidelines for Qualitative Software Engineering Research
Software Engineering
Helps computers find patterns in people's words.
AI Coding with Few-Shot Prompting for Thematic Analysis
Computation and Language
Lets computers sort through lots of text faster.
Position: Thematic Analysis of Unstructured Clinical Transcripts with Large Language Models
Computation and Language
Helps doctors understand patient stories faster.