
Abstract Details

Assessment of ChatGPT’s performance on Neurology Written Board Examination Questions
Research and Methodology
P5 - Poster Session 5 (5:30 PM-6:30 PM)
7-002
To evaluate the performance of ChatGPT in answering neurology board-styled questions
Artificial intelligence (AI) models like ChatGPT have gained prominence in various professional fields, including healthcare. To further study the possible utility of this novel tool in a healthcare setting, we evaluated the performance of ChatGPT in answering neurology board-styled questions.
Neurology board-style questions were accessed from Board Vitals, a commercial neurology question bank. ChatGPT (GPT-4, via Microsoft Bing Chat) was provided the full question prompt and answer choices, and was given up to three attempts to select the correct answer; first-attempt and three-attempt performance were recorded separately. A total of 560 questions (14 blocks of 40 questions) were used, although image-based questions were excluded because ChatGPT cannot process visual input. The AI's answers were then compared with human user data provided by the question bank to gauge its performance.
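The retry protocol described above can be sketched as follows. This is an interpretation, not the authors' actual harness: `grade_with_retries` and `ask_model` are hypothetical names, and `ask_model` stands in for the ChatGPT/Bing Chat interface, which the abstract does not specify.

```python
# Sketch of the up-to-three-attempts scoring protocol (an assumption-laden
# illustration; `ask_model` is a stand-in for the unspecified chat interface).

def grade_with_retries(question, choices, answer, ask_model, max_attempts=3):
    """Return the 1-based attempt on which the model answered correctly,
    or None if it missed on all attempts (up to three, per the abstract)."""
    for attempt in range(1, max_attempts + 1):
        if ask_model(question, choices) == answer:
            return attempt
    return None

# Toy usage with a stubbed model that always answers "B":
stub = lambda question, choices: "B"
print(grade_with_retries("Which cranial nerve...?", ["A", "B", "C", "D"], "B", stub))
```

A question counts toward first-attempt accuracy only when the returned attempt number is 1, and toward three-attempt accuracy whenever the return value is not `None`.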
Out of 509 eligible questions across 14 question blocks, ChatGPT correctly answered 335 questions (65.8%) on the first attempt and 383 (75.3%) within three attempts, corresponding to approximately the 26th and 50th percentiles, respectively. The highest-performing subjects were Pain (100%), Epilepsy & Seizures (85%), and Genetics (82%), while the lowest-performing were Imaging/Diagnostic Studies (27%), Critical Care (41%), and Cranial Nerves (48%).
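As a quick arithmetic check, the accuracy figures follow directly from the counts reported above (the percentile mapping depends on the question bank's user data and cannot be reproduced here):

```python
# Recompute the headline accuracies from the reported counts.
eligible = 509       # questions remaining after excluding image-based items
first_correct = 335  # correct on the first attempt
three_correct = 383  # correct within three attempts

print(f"first attempt:  {first_correct / eligible:.2%}")
print(f"three attempts: {three_correct / eligible:.2%}")
```

These ratios (about 65.82% and 75.25%) are consistent with the 65.8% and 75.3% reported in the abstract.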

This study found that ChatGPT performed comparably to its human counterparts. The AI's accuracy increased with additional attempts, and its overall performance fell within the expected range for neurology learners. These results demonstrate ChatGPT's potential for processing specialized medical information. Future studies should better define the extent to which AI can be integrated into medical decision-making.

Authors/Disclosures
Tse Chiang Chen, MD (Tulane School of Medicine)
PRESENTER
Dr. Chen has nothing to disclose.
Evan Multala No disclosure on file
Patrick Kearns, MD (Tulane University School of Medicine) No disclosure on file
Arthur Wang, MD (Tulane Center for Clinical Neurosciences) Dr. Wang has nothing to disclose.