Evaluating ChatGPT-4o as a decision support tool in multidisciplinary sarcoma tumor boards: heterogeneous performance across various specialties

Background and objectives: Since the launch of ChatGPT in 2023, large language models have attracted substantial interest for deployment in the health care sector. This study evaluates the performance of ChatGPT-4o as a support tool for decision-making in multidisciplinary sarcoma tumor boards. Methods: We created five sarcoma patient cases mimicking real-world scenarios and prompted ChatGPT-4o to issue tumor board decisions. These recommendations were independently assessed by a multidisciplinary panel consisting of an orthopedic surgeon, a plastic surgeon, a radiation oncologist, a radiologist, and a pathologist. Assessments were graded on a Likert scale from 1 (completely disagree) to 5 (completely agree) across five categories: understanding, therapy/diagnostic recommendation, aftercare recommendation, summarization, and support tool effectiveness. Results: The mean score for ChatGPT-4o's performance was 3.76, indicating moderate effectiveness. Surgical specialties received the highest ratings, with a mean score of 4.48, while the diagnostic specialties (radiology/pathology) performed considerably better than radiation oncology, which performed poorly. Conclusions: This study provides initial insights into the use of prompt-engineered large language models as decision support tools in sarcoma tumor boards. ChatGPT-4o's recommendations performed best in the surgical specialties, while the model struggled to give valuable advice in the other specialties tested. Clinicians should understand both the advantages and limitations of this technology for effective integration into clinical practice.


Bibliographic Details
Main Authors: Tekoshin Ammo, Vincent G. J. Guillaume, Ulf Krister Hofmann, Norma M. Ulmer, Nina Buenting, Florian Laenger, Justus P. Beier, Tim Leypold
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Oncology
Subjects: sarcoma, multidisciplinary sarcoma tumor board, artificial intelligence, chat-GPT, large language models, cancer
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2024.1526288/full
_version_ 1841525886759731200
author Tekoshin Ammo
Vincent G. J. Guillaume
Ulf Krister Hofmann
Norma M. Ulmer
Nina Buenting
Florian Laenger
Justus P. Beier
Tim Leypold
author_sort Tekoshin Ammo
collection DOAJ
description Background and objectives: Since the launch of ChatGPT in 2023, large language models have attracted substantial interest for deployment in the health care sector. This study evaluates the performance of ChatGPT-4o as a support tool for decision-making in multidisciplinary sarcoma tumor boards. Methods: We created five sarcoma patient cases mimicking real-world scenarios and prompted ChatGPT-4o to issue tumor board decisions. These recommendations were independently assessed by a multidisciplinary panel consisting of an orthopedic surgeon, a plastic surgeon, a radiation oncologist, a radiologist, and a pathologist. Assessments were graded on a Likert scale from 1 (completely disagree) to 5 (completely agree) across five categories: understanding, therapy/diagnostic recommendation, aftercare recommendation, summarization, and support tool effectiveness. Results: The mean score for ChatGPT-4o's performance was 3.76, indicating moderate effectiveness. Surgical specialties received the highest ratings, with a mean score of 4.48, while the diagnostic specialties (radiology/pathology) performed considerably better than radiation oncology, which performed poorly. Conclusions: This study provides initial insights into the use of prompt-engineered large language models as decision support tools in sarcoma tumor boards. ChatGPT-4o's recommendations performed best in the surgical specialties, while the model struggled to give valuable advice in the other specialties tested. Clinicians should understand both the advantages and limitations of this technology for effective integration into clinical practice.
format Article
id doaj-art-dcce96e2d74f41b1b0501788a71f6ccc
institution Kabale University
issn 2234-943X
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Oncology
spelling doaj-art-dcce96e2d74f41b1b0501788a71f6ccc
Record updated: 2025-01-17T06:50:55Z
Language: eng
Publisher: Frontiers Media S.A.
Journal: Frontiers in Oncology (ISSN 2234-943X), vol. 14, 2025-01-01
DOI: 10.3389/fonc.2024.1526288 (article 1526288)
Title: Evaluating ChatGPT-4o as a decision support tool in multidisciplinary sarcoma tumor boards: heterogeneous performance across various specialties
Authors and affiliations:
Tekoshin Ammo: Department of Plastic Surgery, Hand and Reconstructive Surgery, University Hospital Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Aachen, Germany
Vincent G. J. Guillaume: Department of Plastic Surgery, Hand and Reconstructive Surgery, University Hospital RWTH Aachen, Aachen, Germany
Ulf Krister Hofmann: Department of Orthopedics, Trauma and Reconstructive Surgery, Division of Arthroplasty, University Hospital RWTH Aachen, Aachen, Germany
Norma M. Ulmer: Department of Radiation Oncology, University Hospital RWTH Aachen, Aachen, Germany
Nina Buenting: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
Florian Laenger: Institute of Pathology, University Hospital RWTH Aachen, Aachen, Germany
Justus P. Beier: Department of Plastic Surgery, Hand and Reconstructive Surgery, University Hospital RWTH Aachen, Aachen, Germany
Tim Leypold: Department of Plastic Surgery, Hand and Reconstructive Surgery, University Hospital RWTH Aachen, Aachen, Germany
title Evaluating ChatGPT-4o as a decision support tool in multidisciplinary sarcoma tumor boards: heterogeneous performance across various specialties
title_sort evaluating chatgpt 4o as a decision support tool in multidisciplinary sarcoma tumor boards heterogeneous performance across various specialties
topic sarcoma
multidisciplinary sarcoma tumor board
artificial intelligence
chat-GPT
large language models
cancer
url https://www.frontiersin.org/articles/10.3389/fonc.2024.1526288/full