Standardized patient profile review using large language models for case adjudication in observational research
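Abstract: Using administrative claims and electronic health records for observational studies is common but challenging due to data limitations. Researchers rely on phenotype algorithms, requiring labor-intensive chart reviews for validation. This study investigates whether case adjudication using the previously introduced Knowledge-Enhanced Electronic Profile Review (KEEPER) system with large language models (LLMs) is feasible and could serve as a viable alternative to manual chart review. The task involves adjudicating cases identified by a phenotype algorithm, with KEEPER extracting predefined findings such as symptoms, comorbidities, and treatments from structured data. LLMs then evaluate KEEPER outputs to determine whether a patient truly qualifies as a case. We tested four LLMs including GPT-4, hosted locally to ensure privacy. Using zero-shot prompting and iterative prompt optimization, we found LLM performance, across ten diseases, varied by prompt and model, with sensitivities from 78 to 98% and specificities from 48 to 98%, indicating promise for automating phenotype evaluation.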
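Main Authors: | Martijn J. Schuemie, Anna Ostropolets, Aleh Zhuk, Uladzislau Korsik, Seung In Seo, Marc A. Suchard, George Hripcsak, Patrick B. Ryan |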
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | npj Digital Medicine |
Online Access: | https://doi.org/10.1038/s41746-025-01433-4 |
_version_ | 1841544260834295808 |
---|---|
author | Martijn J. Schuemie; Anna Ostropolets; Aleh Zhuk; Uladzislau Korsik; Seung In Seo; Marc A. Suchard; George Hripcsak; Patrick B. Ryan |
author_sort | Martijn J. Schuemie |
collection | DOAJ |
description | Using administrative claims and electronic health records for observational studies is common but challenging due to data limitations. Researchers rely on phenotype algorithms, requiring labor-intensive chart reviews for validation. This study investigates whether case adjudication using the previously introduced Knowledge-Enhanced Electronic Profile Review (KEEPER) system with large language models (LLMs) is feasible and could serve as a viable alternative to manual chart review. The task involves adjudicating cases identified by a phenotype algorithm, with KEEPER extracting predefined findings such as symptoms, comorbidities, and treatments from structured data. LLMs then evaluate KEEPER outputs to determine whether a patient truly qualifies as a case. We tested four LLMs including GPT-4, hosted locally to ensure privacy. Using zero-shot prompting and iterative prompt optimization, we found LLM performance, across ten diseases, varied by prompt and model, with sensitivities from 78 to 98% and specificities from 48 to 98%, indicating promise for automating phenotype evaluation. |
format | Article |
id | doaj-art-0b94fbd6be0943a0821308c85ad7715f |
institution | Kabale University |
issn | 2398-6352 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | npj Digital Medicine |
spelling | doaj-art-0b94fbd6be0943a0821308c85ad7715f; 2025-01-12T12:40:53Z; eng; Nature Portfolio; npj Digital Medicine; 2398-6352; 2025-01-01; 8117; 10.1038/s41746-025-01433-4; Standardized patient profile review using large language models for case adjudication in observational research; Martijn J. Schuemie, Anna Ostropolets, Aleh Zhuk, Uladzislau Korsik, Seung In Seo, Marc A. Suchard, George Hripcsak, Patrick B. Ryan (all: Observational Health Data Science and Informatics); https://doi.org/10.1038/s41746-025-01433-4 |
title | Standardized patient profile review using large language models for case adjudication in observational research |
url | https://doi.org/10.1038/s41746-025-01433-4 |