Conversational LLM Chatbot ChatGPT-4 for Colonoscopy Boston Bowel Preparation Scoring: An Artificial Intelligence-to-Head Concordance Analysis

Bibliographic Details
Main Authors:	Raffaele Pellegrino, Alessandro Federico, Antonietta Gerarda Gravina
Format:	Article
Language:	English
Published:	MDPI AG 2024-11-01
Series:	Diagnostics
Subjects:	ChatGPT bowel preparation colonoscopy artificial intelligence
Online Access:	https://www.mdpi.com/2075-4418/14/22/2537

Description
Summary:	Background/objectives: To date, no studies have evaluated Chat Generative Pre-Trained Transformer (ChatGPT) as a large language model chatbot in optical applications for digestive endoscopy images. This study aimed to weigh the performance of ChatGPT-4 in assessing bowel preparation (BP) quality for colonoscopy. Methods: ChatGPT-4 analysed 663 anonymised endoscopic images, scoring each according to the Boston BP scale (BBPS). Expert physicians scored the same images subsequently. Results: ChatGPT-4 deemed 369 frames (62.9%) to be adequately prepared (i.e., BBPS > 1) compared to 524 frames (89.3%) assessed by human assessors. The agreement was slight (κ: 0.099, p = 0.0001). The raw human BBPS score was higher at 3 (2–3) than that of ChatGPT-4 at 2 (1–3), demonstrating moderate concordance (W: 0.554, p = 0.036). Conclusions: ChatGPT-4 demonstrates some potential in assessing BP on colonoscopy images, but further refinement is still needed.
ISSN:	2075-4418

Similar Items