Abstract Details

Title Evaluating the Accuracy of Artificial Intelligence (AI) Medical Scribe Software Utilization in Telestroke

Topic Cerebrovascular Disease and Interventional Neurology

Presentation(s) P4 - Poster Session 4 (8:00 AM-9:00 AM)

Poster/Presentation Number 4-013

Objective To measure the performance of an Artificial Intelligence (AI) medical scribe by using Medical Concept Word Error Rate (MC-WER) in telestroke encounters.

Background AI scribes are increasingly used to reduce the documentation burden on physicians in the outpatient setting. However, there is no consensus on the best metrics for evaluating the accuracy and risks of AI scribes. The current study evaluates the Medical Concept Word Error Rate (MC-WER) as a measure of the performance of an AI scribe in automating documentation of physician-patient encounters in the hospital setting.

Design/Methods Transcripts from 25 telestroke patient encounters were de-identified and then processed by an OpenAI large language model (LLM). The History of Present Illness (HPI) sections generated by the AI scribe and by the provider were compared, and Medical Concept Word Omission Rates and Medical Concept Word Addition Rates were calculated. Additionally, the likely diagnoses based on review of the AI scribe-generated and provider-generated HPIs were compared, with the likely diagnosis from the provider HPI serving as the ground truth. AI scribe HPIs that led to an incorrect diagnosis were then compared with those that led to the correct diagnosis.
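The abstract does not state how the omission and addition rates were computed. The following is a minimal sketch of one plausible definition, assuming each HPI has already been reduced to a list of medical concepts and that both rates are normalized by the number of concepts in the provider (reference) HPI, by analogy with how word error rate is normalized by reference length. The function name `concept_error_rates`, the exact-match comparison, and the example concept lists are hypothetical illustrations, not the authors' method or study data.

```python
def concept_error_rates(reference_concepts, scribe_concepts):
    """Compare two lists of medical concepts (assumed pre-extracted).

    Assumption: rates are normalized by the number of unique concepts
    in the reference (provider) HPI; the abstract does not specify this.
    """
    # Normalize case and whitespace, then treat each HPI as a set of concepts.
    reference = {c.strip().lower() for c in reference_concepts}
    scribe = {c.strip().lower() for c in scribe_concepts}
    if not reference:
        raise ValueError("reference HPI must contain at least one concept")

    omitted = reference - scribe  # in the provider HPI, missing from the AI HPI
    added = scribe - reference    # in the AI HPI, absent from the provider HPI

    return {
        "omission_rate": len(omitted) / len(reference),
        "addition_rate": len(added) / len(reference),
        "omitted": sorted(omitted),
        "added": sorted(added),
    }


# Invented example concepts, for illustration only (not study data).
provider_hpi = ["left-sided weakness", "aphasia",
                "last known well 2 hours ago", "atrial fibrillation"]
scribe_hpi = ["left-sided weakness", "atrial fibrillation", "headache"]

rates = concept_error_rates(provider_hpi, scribe_hpi)
print(rates["omission_rate"], rates["addition_rate"])  # 0.5 0.25
```

In this invented case, two of the four provider concepts are missing from the scribe HPI (omission rate 0.5) and one concept was added (addition rate 0.25). A real pipeline would also need a concept-extraction and synonym-matching step, which is outside the scope of this sketch.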

Results Five of the 25 AI scribe-generated HPIs (20%) led to a different diagnosis than the provider-generated HPI for the corresponding case. AI scribe-generated HPIs that resulted in an incorrect diagnosis had a significantly higher Medical Concept Omission Rate than HPIs that led to the correct diagnosis (p=0.0036). There was no difference in the Medical Concept Addition Rate between groups.

Conclusions This study showed the feasibility of utilizing an AI scribe for acute telestroke encounters and provides a framework for assessing the accuracy of the tool. The Medical Concept Omission Rate may be an important metric when evaluating the accuracy and risks of AI scribes. Further research is needed to validate these preliminary data and to further develop the ideal AI scribe for telestroke encounters.

Authors/Disclosures
Lana Prieur PRESENTER	Miss Prieur has nothing to disclose.
Mark McDonald (TeleSpecialists)	Dr. McDonald has nothing to disclose.
Theresa B. Sevilis, DO, FAAN (Telespecialists, LLC)	Dr. Sevilis has received personal compensation in the range of $50,000-$99,999 for serving as an Expert Witness for Sevilis Neurology Consulting, LLV. Dr. Sevilis has or had stock in Moderna.
