Conversational LLM Chatbot ChatGPT-4 for Colonoscopy Boston Bowel Preparation Scoring: An Artificial Intelligence-to-Head Concordance Analysis

Background/objectives: To date, no studies have evaluated Chat Generative Pre-Trained Transformer (ChatGPT) as a large language model chatbot in optical applications for digestive endoscopy images. This study aimed to weigh the performance of ChatGPT-4 in assessing bowel preparation (BP) quality for...

Bibliographic Details
Main Authors: Raffaele Pellegrino, Alessandro Federico, Antonietta Gerarda Gravina
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Diagnostics
Subjects: ChatGPT; bowel preparation; colonoscopy; artificial intelligence
Online Access: https://www.mdpi.com/2075-4418/14/22/2537
_version_ 1846153774682341376
author Raffaele Pellegrino
Alessandro Federico
Antonietta Gerarda Gravina
author_facet Raffaele Pellegrino
Alessandro Federico
Antonietta Gerarda Gravina
author_sort Raffaele Pellegrino
collection DOAJ
description Background/objectives: To date, no studies have evaluated Chat Generative Pre-Trained Transformer (ChatGPT) as a large language model chatbot in optical applications for digestive endoscopy images. This study aimed to weigh the performance of ChatGPT-4 in assessing bowel preparation (BP) quality for colonoscopy. Methods: ChatGPT-4 analysed 663 anonymised endoscopic images, scoring each according to the Boston BP scale (BBPS). Expert physicians subsequently scored the same images. Results: ChatGPT-4 deemed 369 frames (62.9%) to be adequately prepared (i.e., BBPS > 1), compared with 524 frames (89.3%) deemed adequate by human assessors. The agreement was slight (κ: 0.099, p = 0.0001). The raw human BBPS score was higher at 3 (2–3) than that of ChatGPT-4 at 2 (1–3), demonstrating moderate concordance (W: 0.554, p = 0.036). Conclusions: ChatGPT-4 demonstrates some potential in assessing BP on colonoscopy images, but further refinement is still needed.
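The κ reported in the description is presumably Cohen's kappa, which corrects raw agreement between two raters for the agreement expected by chance. The sketch below shows how such a value can be computed from a 2×2 table of adequate/inadequate ratings. The function name `cohen_kappa` and the counts in `example_table` are hypothetical and chosen only for illustration; they are not the study's data, and the record does not state which software the authors used.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of items that rater A put in category i
    and rater B put in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: share of items on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    # Undefined (division by zero) if both raters use a single category.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts, NOT taken from the study:
# rows = rater A (adequate, inadequate), columns = rater B (adequate, inadequate)
example_table = [[40, 10],
                 [20, 30]]
kappa = cohen_kappa(example_table)
print(round(kappa, 3))
```

With these illustrative counts the observed agreement is 0.70 and the chance agreement is 0.50, so kappa is 0.40. A value near 0.1, as reported in the description, means the two raters agree only slightly more often than chance would predict.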
format Article
id doaj-art-135cd8e594a845d0a52645fe792b34d3
institution Kabale University
issn 2075-4418
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Diagnostics
spelling doaj-art-135cd8e594a845d0a52645fe792b34d3 | 2024-11-26T17:59:49Z | eng | MDPI AG | Diagnostics | 2075-4418 | 2024-11-01 | 14 | 22 | 2537 | 10.3390/diagnostics14222537 | Conversational LLM Chatbot ChatGPT-4 for Colonoscopy Boston Bowel Preparation Scoring: An Artificial Intelligence-to-Head Concordance Analysis | Raffaele Pellegrino; Alessandro Federico; Antonietta Gerarda Gravina (all: Hepatogastroenterology Division, Department of Precision Medicine, University of Campania Luigi Vanvitelli, Via L. de Crecchio, 80138 Naples, Italy) | https://www.mdpi.com/2075-4418/14/22/2537 | ChatGPT; bowel preparation; colonoscopy; artificial intelligence
title Conversational LLM Chatbot ChatGPT-4 for Colonoscopy Boston Bowel Preparation Scoring: An Artificial Intelligence-to-Head Concordance Analysis
topic ChatGPT
bowel preparation
colonoscopy
artificial intelligence
url https://www.mdpi.com/2075-4418/14/22/2537