Addressing Semantic Variability in Clinical Outcome Reporting Using Large Language Models

Background/Objectives: Clinical trials frequently employ diverse terminologies and definitions to describe similar outcomes, leading to ambiguity and inconsistency in data interpretation. Addressing the variability in clinical outcome reports and integrating semantically similar o...

Bibliographic Details
Main Authors: Fatemeh Shah-Mohammadi, Joseph Finkelstein
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: BioMedInformatics
Subjects:
Online Access: https://www.mdpi.com/2673-7426/4/4/116
author Fatemeh Shah-Mohammadi
Joseph Finkelstein
collection DOAJ
description Background/Objectives: Clinical trials frequently employ diverse terminologies and definitions to describe similar outcomes, leading to ambiguity and inconsistency in data interpretation. Addressing the variability in clinical outcome reports and integrating semantically similar outcomes is important in healthcare and clinical research. Variability in outcome reporting not only hinders the comparability of clinical trial results but also poses significant challenges for evidence synthesis, meta-analysis, and evidence-based decision-making. Methods: This study investigates how to reduce variability in outcome measure reporting by comparing a rule-based approach with a large language model-based approach. The rule-based approach leverages well-known ontologies; the second approach uses sentence-bidirectional encoder representations from transformers (SBERT) to identify semantically similar outcomes, along with a Generative Pre-trained Transformer (GPT) to refine the results. Results: A relatively low percentage of outcomes is linked to established ontologies through the rule-based approach. Analysis of outcomes by word count highlighted the absence of ontological linkage for three-word outcomes, which indicates potential gaps in semantic representation. Conclusions: The large language model (LLM)-based approach was able to identify similar outcomes, even those with more than three words, suggesting a crucial role in outcome harmonization efforts, potentially reducing redundancy and enhancing data interoperability.
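The similarity step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function below is a toy bag-of-words stand-in for an SBERT sentence embedding, the function names, the example outcomes, and the 0.5 threshold are all assumptions chosen for illustration, and the GPT refinement step is not shown. It only demonstrates the general idea of flagging outcome pairs whose vectors have high cosine similarity.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase token counts.

    A real pipeline would call an SBERT model here instead (assumption).
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_pairs(outcomes, threshold=0.5):
    """Return (outcome_i, outcome_j, score) for pairs at or above threshold.

    The threshold value is hypothetical, not taken from the article.
    """
    vectors = [embed(o) for o in outcomes]
    pairs = []
    for i in range(len(outcomes)):
        for j in range(i + 1, len(outcomes)):
            score = cosine(vectors[i], vectors[j])
            if score >= threshold:
                pairs.append((outcomes[i], outcomes[j], round(score, 3)))
    return pairs


# Hypothetical outcome strings, invented for this example.
outcomes = [
    "change in systolic blood pressure",
    "systolic blood pressure change from baseline",
    "overall survival",
]
pairs = similar_pairs(outcomes)
for first, second, score in pairs:
    print(f"{score:.3f}  {first}  <->  {second}")
```

With a token-count embedding, only outcomes that share words can match; a sentence-embedding model is what would allow differently worded outcomes to be grouped, which is the motivation the abstract gives for the SBERT-based approach.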
format Article
id doaj-art-e8b3459bdabf4b5296d554146c8239a4
institution Kabale University
issn 2673-7426
language English
publishDate 2024-10-01
publisher MDPI AG
record_format Article
series BioMedInformatics
spelling doaj-art-e8b3459bdabf4b5296d554146c8239a4
2024-12-27T14:13:18Z
eng
MDPI AG
BioMedInformatics
2673-7426
2024-10-01
vol. 4, no. 4, pp. 2173-2185
10.3390/biomedinformatics4040116
Addressing Semantic Variability in Clinical Outcome Reporting Using Large Language Models
Fatemeh Shah-Mohammadi, Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT 84108, USA
Joseph Finkelstein, Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT 84108, USA
https://www.mdpi.com/2673-7426/4/4/116
title Addressing Semantic Variability in Clinical Outcome Reporting Using Large Language Models
topic semantic variability
clinical outcome
ontology
large language model
url https://www.mdpi.com/2673-7426/4/4/116