Abstract Details

Evaluating the Accuracy of Large Language Models in Stroke Management
Cerebrovascular Disease and Interventional Neurology
P10 - Poster Session 10 (8:00 AM-9:00 AM)
4-020
We aimed to evaluate the accuracy of large language model artificial intelligence tools (LLM-AI) in acute and inpatient stroke management.
Artificial intelligence is currently being investigated as a tool for medical decision making. AI imaging analysis software (e.g., Rapid AI, Brainomix, VizAI, Aidoc) is used as an adjunct to medical decision making in patients with acute stroke. LLM-AI tools (ChatGPT, DoximityGPT, OpenEvidence) have been used in research studies to support medical decisions. However, their accuracy in stroke diagnosis and management has not been systematically assessed.
We presented 25 real-life stroke scenarios (5 cases for each of the 5 stroke mechanisms defined by the TOAST criteria) to three LLMs (ChatGPT, DoximityGPT, OpenEvidence). Cases were collected from October 2024 to September 2025, and the LLMs were prompted between September 20 and October 1, 2025, using the most recent version of each model. Accuracy of LLM-AI in medical decision making was evaluated on criteria including adherence to guidelines, identification of stroke mechanism, and treatment decisions. AI responses were compared with decisions provided by two neurologists.
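As an illustration of how per-model accuracy against a neurologist reference could be tallied, here is a minimal Python sketch. It is not the authors' analysis code: the function name `accuracy_by_model`, the tuple layout, and the placeholder grades are all assumptions made for the example, and the data shown are invented, not study results.

```python
from collections import defaultdict


def accuracy_by_model(graded):
    """Return percent-correct per model.

    `graded` is an iterable of (model, case_id, is_correct) tuples, where
    is_correct records whether the model's answer matched the reference
    decision. This layout is a hypothetical one chosen for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for model, _case_id, is_correct in graded:
        total[model] += 1
        correct[model] += 1 if is_correct else 0
    return {model: 100.0 * correct[model] / total[model] for model in total}


# Placeholder grades for one hypothetical model on 25 cases (not study data):
# the first 17 cases are marked correct, the remaining 8 incorrect.
example = [("model_a", case_id, case_id < 17) for case_id in range(25)]
result = accuracy_by_model(example)
print(result)
```

With 25 cases per model, each additional correct answer changes the accuracy by 4 percentage points, so 17 correct answers out of 25 correspond to 68%.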
Accuracy in identifying stroke mechanism ranged from 60% to 76%, and was highest for large artery atherosclerosis and lowest for embolic stroke of undetermined source. Accuracy of acute stroke management decisions was lower than expected: LLM-AI answers were correct in about 44% to 68% of cases, with OpenEvidence providing the most accurate decisions (68%). Overall adherence to guidelines ranged from 60% to 74%, with ChatGPT achieving the highest overall adherence (74%).
LLM-AI demonstrated moderate accuracy in stroke management. Future work should explore integrating LLM-AI into clinical workflows as an adjunctive tool.
Authors/Disclosures
Sahil S. Suvarna, DO (Presenter): Dr. Suvarna has nothing to disclose.
Karim Makhoul, Sr., MD: Dr. Makhoul has nothing to disclose.
Ahmad Shamulzai: Ahmad Shamulzai has nothing to disclose.
Angela Xia, MS: Ms. Xia has nothing to disclose.
Gregory Kurgansky, DO: Gregory Kurgansky, DO has nothing to disclose.
Richard Libman, MD, FAAN (Northwell Health): Dr. Libman has nothing to disclose.