Addressing Semantic Variability in Clinical Outcome Reporting Using Large Language Models
<b>Background/Objectives</b>: Clinical trials frequently employ diverse terminologies and definitions to describe similar outcomes, leading to ambiguity and inconsistency in data interpretation. Addressing the variability in clinical outcome reports and integrating semantically similar o...
| Main Authors: | Fatemeh Shah-Mohammadi, Joseph Finkelstein |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-10-01 |
| Series: | BioMedInformatics |
| Subjects: | semantic variability; clinical outcome; ontology; large language model |
| Online Access: | https://www.mdpi.com/2673-7426/4/4/116 |
| _version_ | 1846105672183185408 |
|---|---|
| author | Fatemeh Shah-Mohammadi; Joseph Finkelstein |
| collection | DOAJ |
| description | <b>Background/Objectives</b>: Clinical trials frequently employ diverse terminologies and definitions to describe similar outcomes, leading to ambiguity and inconsistency in data interpretation. Addressing the variability in clinical outcome reports and integrating semantically similar outcomes is important in healthcare and clinical research. Variability in outcome reporting not only hinders the comparability of clinical trial results but also poses significant challenges in evidence synthesis, meta-analysis, and evidence-based decision-making. <b>Methods</b>: This study investigates how to reduce variability in outcome measure reporting by comparing a rule-based approach with a large language model-based approach. The rule-based approach leverages well-known ontologies; the second approach uses sentence-bidirectional encoder representations from transformers (SBERT) to identify semantically similar outcomes and a Generative Pre-trained Transformer (GPT) to refine the results. <b>Results</b>: A relatively low percentage of outcomes is linked to established ontologies under the rule-based approach. Analysis of outcomes by word count highlighted the absence of ontological linkage for three-word outcomes, which indicates potential gaps in semantic representation. <b>Conclusions</b>: The large language model (LLM)-based approach was able to identify similar outcomes, even those with more than three words, suggesting a crucial role in outcome harmonization efforts, potentially reducing redundancy and enhancing data interoperability. |
| format | Article |
| id | doaj-art-e8b3459bdabf4b5296d554146c8239a4 |
| institution | Kabale University |
| issn | 2673-7426 |
| language | English |
| publishDate | 2024-10-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | BioMedInformatics |
| spelling | doaj-art-e8b3459bdabf4b5296d554146c8239a4; 2024-12-27T14:13:18Z; eng; MDPI AG; BioMedInformatics; 2673-7426; 2024-10-01; vol. 4, no. 4, pp. 2173-2185; 10.3390/biomedinformatics4040116; Addressing Semantic Variability in Clinical Outcome Reporting Using Large Language Models; Fatemeh Shah-Mohammadi (0), Joseph Finkelstein (1); (0) Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT 84108, USA; (1) Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT 84108, USA; https://www.mdpi.com/2673-7426/4/4/116; semantic variability; clinical outcome; ontology; large language model |
| title | Addressing Semantic Variability in Clinical Outcome Reporting Using Large Language Models |
| topic | semantic variability; clinical outcome; ontology; large language model |
| url | https://www.mdpi.com/2673-7426/4/4/116 |