Abstract Details

A Machine Learning Approach for Identifying People with Neuroinfectious Diseases in Electronic Health Records
Infectious Disease
P2 - Poster Session 2 (11:45 AM-12:45 PM)
13-001

To develop an Automated Electronic Health Record (EHR) Phenotyping (AEP) model using machine learning (ML) to enhance neuroinfectious disease (NID) cohort selection.

Identifying NID cases using manual chart review or billing codes is time-consuming and suboptimal. EHR-based ML models for NID identification have yet to be explored.

Clinical notes from patients with a lumbar puncture were obtained from the EHR of an academic hospital network, with half having NID-related ICD-9/10 codes. Six physicians with NID expertise each manually reviewed 500 charts to generate the ground truth; charts with uncertain diagnoses were discarded. Regular expressions were developed to match NID keywords, and the extracted text was converted into bag-of-words representations using n-grams of one to three words. Notes were randomly split into training (80%) and hold-out testing (20%) sets. Feature selection was performed using a variance threshold of 0.075. An extreme gradient boosting (XGBoost) model classified NID cases, and performance was assessed on the testing set using the Area Under the Receiver Operating Characteristic Curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC); 95% confidence intervals (CI) were obtained using bootstrapping.
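The feature-construction steps described above (keyword matching, n-gram bag-of-words, variance-based feature selection) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the keyword pattern, the five-word context window, the helper names, and the use of raw counts with population variance are all assumptions made for illustration. Only the n-gram range and the 0.075 threshold come from the abstract.

```python
import re
from collections import Counter

# Hypothetical keyword pattern; the abstract does not list the actual expressions.
NID_PATTERN = re.compile(r"meningitis|encephalitis|neurosyphilis")


def extract_context(note, window=5):
    """Return the tokens within `window` words of any keyword match (assumed window)."""
    tokens = re.findall(r"[a-z0-9]+", note.lower())
    keep = set()
    for i, tok in enumerate(tokens):
        if NID_PATTERN.fullmatch(tok):
            keep.update(range(max(0, i - window), min(len(tokens), i + window + 1)))
    return [tokens[i] for i in sorted(keep)]


def ngrams(tokens, n_min=1, n_max=3):
    """List all n-grams of length n_min..n_max as space-joined strings."""
    out = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def bag_of_words(notes):
    """Build a vocabulary and a note-by-term count matrix."""
    counts = [Counter(ngrams(extract_context(note))) for note in notes]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def variance_threshold(vocab, matrix, threshold=0.075):
    """Keep terms whose population variance across notes exceeds `threshold`."""
    n = len(matrix)
    kept = []
    for j, term in enumerate(vocab):
        column = [row[j] for row in matrix]
        mean = sum(column) / n
        variance = sum((x - mean) ** 2 for x in column) / n
        if variance > threshold:
            kept.append(term)
    return kept


if __name__ == "__main__":
    # Toy notes, invented for this example only.
    notes = [
        "Patient has viral meningitis with fever",
        "No evidence of encephalitis on MRI",
        "Routine follow up visit",
    ]
    vocab, matrix = bag_of_words(notes)
    print(variance_threshold(vocab, matrix))
```

The selected terms would then serve as input columns for a classifier such as XGBoost. A note with no keyword match contributes an all-zero row in this sketch.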

This cohort included 2,469 patients, with 2,956 notes from January 2010 to September 2023. The mean age was 58.2 years (standard deviation 19.6); 55% were women, 77% White, and 83% non-Hispanic. A total of 15.9% (466/2,469) were confirmed NID cases, and only 31.0% (452/1,460) of notes with NID-related ICD codes were deemed positive for an NID. Of the initial 289,377 features, 612 were selected, with the most significant features being “meningitis,” “encephalitis,” “CSF,” “viral,” and “neurosyphilis.” The XGBoost model classified NID cases with an AUROC of 0.95 (95% CI: 0.92-0.97) and an AUPRC of 0.80 (95% CI: 0.71-0.88).
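The confidence intervals above were obtained by bootstrapping. As a rough illustration of how such an interval can be computed for the AUROC, the following standard-library sketch resamples the test set with replacement and takes percentiles of the resampled statistic. The percentile method, the number of resamples, the seed, and the function names are assumptions; the abstract does not specify which bootstrap variant was used.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (assumed variant, for illustration)."""
    auroc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy labels and scores, invented for this example only.
    y_true = [0, 0, 1, 1, 0, 1, 0, 1]
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7]
    print(auroc(y_true, y_score), bootstrap_ci(y_true, y_score))
```

The same resampling loop applies to the AUPRC by swapping in a different metric function.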

Our ML-driven AEP model accurately identifies NID cases using clinical notes, thereby enhancing efficiency in NID research and cohort generation. Future studies incorporating multiple EHRs are needed to ensure generalizability.

Authors/Disclosures
Arjun Singh, MBBS (MGH)
PRESENTER
Dr. Singh has nothing to disclose.
Marta Fernandes (Massachusetts General Hospital) Marta Fernandes has nothing to disclose.
Haoqi Sun, PhD (Massachusetts General Hospital) Dr. Sun has nothing to disclose.
Carson M. Quinn, MD (Mass General Brigham) Dr. Quinn has nothing to disclose.
George K. Harrold, MD (Brigham and Women's Hospital) Dr. Harrold has nothing to disclose.
Rebecca L. Gillani, MD (Massachusetts General Hospital) The institution of Dr. Gillani has received research support from The Phyllis and Jerome Lyle Rappaport Foundation. The institution of Dr. Gillani has received research support from McCourt Foundation. The institution of Dr. Gillani has received research support from Roche.
Sarah Turbett (MGH) An immediate family member of Sarah Turbett has received personal compensation for serving as an employee of Analysis Group Inc. Sarah Turbett has received personal compensation in the range of $0-$499 for serving as a Consultant for CarbX. Sarah Turbett has received research support from SeLux Diagnostics.
Sudeshna Das (MGH) No disclosure on file
M. B. Westover, MD, PhD (MGH) Dr. Westover has received personal compensation in the range of $50,000-$99,999 for serving as a Consultant for Beacon Biosignals. Dr. Westover has stock in Beacon Biosignals. The institution of Dr. Westover has received research support from NIH. Dr. Westover has received publishing royalties from a publication relating to health care. Dr. Westover has a non-compensated relationship as a cofounder with Beacon Biosignals that is relevant to AAN interests or activities.
Shibani S. Mukerji, MD, PhD (Massachusetts General Hospital) Dr. Mukerji has received personal compensation in the range of $500-$4,999 for serving as an Editor, Associate Editor, or Editorial Advisory Board Member for Dynamed. Dr. Mukerji has or had stock in Gilead Science. Dr. Mukerji has or had stock in Ranpack. Dr. Mukerji has or had stock in Snowflake. An immediate family member of Dr. Mukerji has or had stock in Amgen. The institution of Dr. Mukerji has received research support from NIH. The institution of Dr. Mukerji has received research support from Massachusetts General Hospital.