
Large Language Models' Varying Accuracy in Recognizing Risk-Promoting and Health-Supporting Sentiments in Public Health Discourse: The Cases of HPV Vaccination and Heated Tobacco Products

Published: July 6, 2025 | arXiv ID: 2507.04364v1

By: Soojong Kim, Kwanho Kim, Hye Min Kim

Potential Business Impact:

Helps understand health opinions online better.

Machine learning methods are increasingly applied to large-scale data to analyze health-related public discourse, but questions remain about their ability to accurately detect different types of health sentiments. In particular, Large Language Models (LLMs) have gained attention as a powerful technology, yet their accuracy and feasibility in capturing different opinions and perspectives on health issues remain largely unexplored. This research therefore examines how accurately three prominent LLMs (GPT, Gemini, and LLAMA) detect risk-promoting versus health-supporting sentiments across two critical public health topics: Human Papillomavirus (HPV) vaccination and heated tobacco products (HTPs). Drawing on data from Facebook and Twitter, we curated multiple sets of messages supporting or opposing recommended health behaviors, with human annotations serving as the gold standard for sentiment classification. The findings indicate that all three LLMs generally demonstrate substantial accuracy in classifying risk-promoting and health-supporting sentiments, although notable discrepancies emerge by platform, health issue, and model type. Specifically, models often show higher accuracy for risk-promoting sentiment on Facebook, whereas health-supporting messages are more accurately detected on Twitter. An additional analysis shows that LLMs struggle to reliably detect neutral messages. These results highlight the importance of carefully selecting and validating language models for public health analyses, particularly given potential biases in training data that may lead LLMs to overestimate or underestimate the prevalence of certain perspectives.
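
The comparison described in the abstract, model labels checked against human gold-standard annotations and broken down by platform and sentiment, can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function name `per_class_accuracy`, the label strings, and the toy records are all hypothetical, and the numbers it prints say nothing about the paper's actual results.

```python
from collections import defaultdict


def per_class_accuracy(records):
    """Map (platform, gold_label) to the share of messages whose model label matches gold."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for platform, gold, predicted in records:
        key = (platform, gold)
        totals[key] += 1
        if predicted == gold:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical toy records: (platform, human gold label, model label).
# These are invented for illustration and are not data from the paper.
toy_records = [
    ("facebook", "risk-promoting", "risk-promoting"),
    ("facebook", "risk-promoting", "risk-promoting"),
    ("facebook", "health-supporting", "risk-promoting"),
    ("facebook", "health-supporting", "health-supporting"),
    ("twitter", "risk-promoting", "neutral"),
    ("twitter", "risk-promoting", "risk-promoting"),
    ("twitter", "health-supporting", "health-supporting"),
    ("twitter", "neutral", "health-supporting"),
]

for key, accuracy in sorted(per_class_accuracy(toy_records).items()):
    print(key, round(accuracy, 2))
```

Reporting accuracy separately for each gold label, rather than as one pooled figure, is what makes it possible to see whether a model is systematically better at one sentiment than another on a given platform.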

An Empirical Analysis of LLMs for Countering Misinformation

Computation and Language

Helps computers spot fake news, but needs improvement.

28 Feb 2025

Benchmarking Open-Source Large Language Models on Healthcare Text Classification Tasks

Computation and Language

Helps computers find health info from text.

19 Mar 2025

Dr. GPT Will See You Now, but Should It? Exploring the Benefits and Harms of Large Language Models in Medical Diagnosis using Crowdsourced Clinical Cases

Computers and Society

AI helps answer everyday health questions accurately.

13 Jun 2025


Country of Origin

🇺🇸 United States

Page Count

34 pages
