A large language model based pipeline for extracting information from patient complaint and anamnesis in clinical notes for severity assessment

Bibliographic Details
Main Authors:	Hui Gao, Kaipeng Wang, Yuan Yuan, Yueguo Wang, Qingyuan Liu, Yulan Wang, Jian Sun, Wenwen Wang, Huanli Wang, Shusheng Zhou, Kui Jin, Mengping Zhang, Yinglei Lai
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-07-01
Series:	Scientific Reports
Subjects:	Large language model; Emergency department; Triage; In-context learning; Retrieval-augmented generation
Online Access:	https://doi.org/10.1038/s41598-025-07649-4

Description
Summary:	Identifying patients with critical illness in emergency departments (EDs) is an ongoing challenge, partly due to the limited information available at the time of admission. Clinical notes in patient records have already received attention for their value in improving prediction. Recent large language models (LLMs) have demonstrated promising performance. However, the use of LLMs for analyzing clinical notes has not been extensively investigated. To improve the severity assessment of illness and the prediction of triage level, we developed a pipeline that uses LLMs (e.g., ChatGLM-2, GLM-4, and Alpaca-2) to extract information from the patient complaint and anamnesis in clinical notes. In this pipeline, an LLM is supplied with text input containing the complaint and anamnesis of a patient, where the input is further constructed with a prompt template, in-context learning (ICL), and retrieval-augmented generation (RAG). A severity score is then extracted from the LLM output and integrated into a predictive model to improve its performance. We demonstrated the effectiveness of our pipeline on patient records derived from the Chinese Emergency Triage, Assessment, and Treatment (CETAT) database. The extracted score was incorporated into logistic regression as a predictor. At the early stage, when vital signs were typically not yet measured, the predictive value of the patient complaint and anamnesis was illustrated (evidenced by an improvement in AUC-ROC from 0.746 to 0.802). At the later stage, when vital signs became available, the enhancements in prediction attributable to the score were weaker but were still observed with statistical significance in most cases. Recent LLMs are capable of extracting valuable information from clinical notes for identifying critical illness, and this effectiveness has been illustrated in our study. It is still necessary to develop more efficient LLM-based methods in order to achieve better performance.
ISSN:	2045-2322
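
The input construction described in the summary (prompt template, in-context examples, retrieved references, then score extraction) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the article: the template wording, the 0 to 10 severity scale, the token-overlap retrieval (a crude stand-in for a real RAG retriever), and the function names `retrieve`, `build_prompt`, and `parse_severity`. No LLM is called; the sketch only shows how the prompt might be assembled and how a numeric score might be parsed from a reply.

```python
import re

# Hypothetical prompt template; the article's actual wording is not given in the abstract.
TEMPLATE = (
    "You are an emergency triage assistant.\n"
    "Reference notes:\n{references}\n\n"
    "Examples:\n{examples}\n\n"
    "Patient complaint: {complaint}\n"
    "Anamnesis: {anamnesis}\n"
    "Rate the severity from 0 (mild) to 10 (critical). "
    "Answer as 'Severity: <number>'."
)


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for an embedding model."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, corpus, k=2):
    """Return the k corpus entries sharing the most tokens with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return ranked[:k]


def build_prompt(complaint, anamnesis, examples, corpus, k=2):
    """Fill the template with retrieved references and in-context examples."""
    references = retrieve(complaint + " " + anamnesis, corpus, k)
    example_text = "\n".join(
        f"Complaint: {c}\nSeverity: {s}" for c, s in examples
    )
    return TEMPLATE.format(
        references="\n".join(f"- {r}" for r in references),
        examples=example_text,
        complaint=complaint,
        anamnesis=anamnesis,
    )


def parse_severity(reply, low=0.0, high=10.0):
    """Extract the first 'Severity: <number>' in the reply.

    Returns None when no score is found or the score is out of range.
    """
    match = re.search(r"Severity:\s*(\d+(?:\.\d+)?)", reply)
    if match is None:
        return None
    score = float(match.group(1))
    return score if low <= score <= high else None
```

In a setup like the one the summary describes, the value returned by `parse_severity` would then be added as one more predictor column in a logistic regression model, and the AUC-ROC compared with and without it.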

Similar Items