A Multi-faceted Analysis of Cognitive Abilities: Evaluating Prompt Methods with Large Language Models on the CONSORT Checklist
By: Sohyeon Jeon, Hyung-Chul Lee
Potential Business Impact:
Helps doctors check if medical studies follow rules.
Despite the rapid expansion of Large Language Models (LLMs) in healthcare, the ability of these systems to assess clinical trial reporting according to CONSORT standards remains unclear, particularly with respect to their cognitive and reasoning strategies. This study applies a behavioral and metacognitive analytic approach with expert-validated data, systematically comparing two representative LLMs under three prompt conditions. Clear differences emerged in how the models approached individual CONSORT items, and prompt type shaped response patterns through shifts in reasoning style, explicit expressions of uncertainty, and alternative interpretations. Our results highlight the current limitations of these systems in clinical compliance automation and underscore the importance of understanding their cognitive adaptations and strategic behavior in developing more explainable and reliable medical AI.
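To make the evaluation setup concrete, the sketch below shows one way such a comparison could be structured: two models are queried on each CONSORT item under several prompt conditions, and their answers are scored against expert labels. This is a minimal illustration, not the authors' code; the `query_model` stub, the prompt templates, and the item list are all hypothetical placeholders.

```python
# Minimal sketch (not the authors' pipeline) of comparing prompt conditions
# on CONSORT checklist items. `query_model` is a hypothetical stand-in for a
# real LLM API call; items, prompts, and labels are illustrative only.
from collections import defaultdict

CONSORT_ITEMS = [
    "1a Title identifies the study as a randomized trial",
    "8a Method used to generate the random allocation sequence",
    "10 Who generated the sequence, enrolled, and assigned participants",
]

PROMPT_CONDITIONS = {
    "zero_shot": "Does the manuscript report this CONSORT item? Answer Yes or No.\nItem: {item}\nText: {text}",
    "chain_of_thought": "Reason step by step, then answer Yes or No.\nItem: {item}\nText: {text}",
    "self_consistency": "Give three independent judgments, then a final Yes or No.\nItem: {item}\nText: {text}",
}

def query_model(model: str, prompt: str) -> str:
    """Hypothetical placeholder for an LLM call; swap in a real client here."""
    return "No"  # stub so the sketch runs end to end

def evaluate(models, manuscript_text, gold_labels):
    """Count agreement with expert labels per model and prompt condition."""
    agreement = defaultdict(lambda: defaultdict(int))
    for model in models:
        for cond, template in PROMPT_CONDITIONS.items():
            for item in CONSORT_ITEMS:
                prompt = template.format(item=item, text=manuscript_text)
                answer = query_model(model, prompt).strip().lower().startswith("yes")
                agreement[model][cond] += int(answer == gold_labels[item])
    return agreement

if __name__ == "__main__":
    gold = {item: True for item in CONSORT_ITEMS}  # illustrative expert labels
    results = evaluate(["model_a", "model_b"], "...manuscript text...", gold)
    for model, by_cond in results.items():
        for cond, correct in by_cond.items():
            print(f"{model} / {cond}: {correct}/{len(CONSORT_ITEMS)} items matched expert labels")
```

Per-item agreement counts of this kind would then feed the behavioral and metacognitive analysis the abstract describes, such as contrasting how often each prompt condition elicits hedged or alternative readings of an item.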
Similar Papers
Evaluating the Ability of Large Language Models to Identify Adherence to CONSORT Reporting Guidelines in Randomized Controlled Trials: A Methodological Evaluation Study
Computation and Language
Helps check if medical study reports are complete.
Evaluation of Clinical Trials Reporting Quality using Large Language Models
Computation and Language
Helps doctors check if medical studies are honest.
Asking the Right Questions: Benchmarking Large Language Models in the Development of Clinical Consultation Templates
Computation and Language
Helps doctors write patient notes faster.