A case study on using a large language model to analyze continuous glucose monitoring data

Abstract Continuous glucose monitors (CGM) provide valuable insights about glycemic control that aid in diabetes management. However, interpreting metrics and charts and synthesizing them into linguistic summaries is often non-trivial for patients and providers. The advent of large language models (...

Bibliographic Details
Main Authors: Elizabeth Healey, Amelia Li Min Tan, Kristen L. Flint, Jessica L. Ruiz, Isaac Kohane
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-84003-0
author Elizabeth Healey
Amelia Li Min Tan
Kristen L. Flint
Jessica L. Ruiz
Isaac Kohane
collection DOAJ
description Abstract Continuous glucose monitors (CGM) provide valuable insights about glycemic control that aid in diabetes management. However, interpreting metrics and charts and synthesizing them into linguistic summaries is often non-trivial for patients and providers. The advent of large language models (LLMs) has enabled real-time text generation and summarization of medical data. The objective of this study was to assess the strengths and limitations of using an LLM to analyze raw CGM data and produce summaries of 14 days of data for patients with type 1 diabetes. We first evaluated the ability of GPT-4 to compute quantitative metrics specific to diabetes found in an Ambulatory Glucose Profile (AGP). Then, using two independent clinician graders, we evaluated the accuracy, completeness, safety, and suitability of qualitative descriptions produced by GPT-4 across five different CGM analysis tasks. GPT-4 performed 9 out of the 10 quantitative metrics tasks with perfect accuracy across all 10 cases. The clinician-evaluated CGM analysis tasks had good performance across measures of accuracy [lowest task mean score 8/10, highest task mean score 10/10], completeness [lowest task mean score 7.5/10, highest task mean score 10/10], and safety [lowest task mean score 9.5/10, highest task mean score 10/10]. Our work serves as a preliminary study on how generative language models can be integrated into diabetes care through data summarization and, more broadly, the potential to leverage LLMs for streamlined medical time series analysis.
format Article
id doaj-art-c8ffd82dc27d4febbd4e9bcdc49845cd
institution Kabale University
issn 2045-2322
language English
publishDate 2025-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-c8ffd82dc27d4febbd4e9bcdc49845cd
2025-01-12T12:21:32Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-01-01
15117
10.1038/s41598-024-84003-0
A case study on using a large language model to analyze continuous glucose monitoring data
Elizabeth Healey (0), Amelia Li Min Tan (1), Kristen L. Flint (2), Jessica L. Ruiz (3), Isaac Kohane (4)
Program in Health Sciences and Technology, Massachusetts Institute of Technology
Department of Biomedical Informatics, Harvard Medical School
Diabetes Research Center, Massachusetts General Hospital
Division of Endocrinology, Boston Children’s Hospital
Department of Biomedical Informatics, Harvard Medical School
https://doi.org/10.1038/s41598-024-84003-0
title A case study on using a large language model to analyze continuous glucose monitoring data
url https://doi.org/10.1038/s41598-024-84003-0