Conversational LLM Chatbot ChatGPT-4 for Colonoscopy Boston Bowel Preparation Scoring: An Artificial Intelligence-to-Head Concordance Analysis
Background/objectives: To date, no studies have evaluated Chat Generative Pre-Trained Transformer (ChatGPT) as a large language model chatbot in optical applications for digestive endoscopy images. This study aimed to weigh the performance of ChatGPT-4 in assessing bowel preparation (BP) quality for colonoscopy. Methods: ChatGPT-4 analysed 663 anonymised endoscopic images, scoring each according to the Boston BP scale (BBPS). Expert physicians scored the same images subsequently. Results: ChatGPT-4 deemed 369 frames (62.9%) to be adequately prepared (i.e., BBPS > 1) compared to 524 frames (89.3%) assessed by human assessors. The agreement was slight (κ: 0.099, *p* = 0.0001). The raw human BBPS score was higher at 3 (2–3) than that of ChatGPT-4 at 2 (1–3), demonstrating moderate concordance (W: 0.554, *p* = 0.036). Conclusions: ChatGPT-4 demonstrates some potential in assessing BP on colonoscopy images, but further refinement is still needed.
| Main Authors: | Raffaele Pellegrino, Alessandro Federico, Antonietta Gerarda Gravina |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-11-01 |
| Series: | Diagnostics |
| Subjects: | ChatGPT; bowel preparation; colonoscopy; artificial intelligence |
| Online Access: | https://www.mdpi.com/2075-4418/14/22/2537 |
| Record ID: | doaj-art-135cd8e594a845d0a52645fe792b34d3 |
|---|---|
| Collection: | DOAJ |
| Institution: | Kabale University |
| ISSN: | 2075-4418 |
| DOI: | 10.3390/diagnostics14222537 |
| Citation: | Diagnostics, vol. 14, no. 22, article 2537 (2024) |
| Author affiliations: | Hepatogastroenterology Division, Department of Precision Medicine, University of Campania Luigi Vanvitelli, Via L. de Crecchio, 80138 Naples, Italy (all three authors) |
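The abstract reports frame-level agreement as a kappa statistic on the binary adequate-preparation call (BBPS > 1) and concordance as a W statistic on the raw BBPS scores, presumably Cohen's kappa and Kendall's coefficient of concordance. The article does not publish its analysis code; the sketch below is only an illustration, under those assumptions, of how such statistics could be computed for two raters scoring the same frames. The score arrays are random placeholders rather than the study data, and the Kendall's W helper omits the tie-correction term.

```python
# Minimal sketch, not the authors' analysis code: frame-level agreement (Cohen's kappa
# on the binary adequacy call) and concordance (Kendall's W, no tie correction) for two raters.
# The score arrays are random placeholders, NOT the study data.
import numpy as np
from scipy.stats import rankdata
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
n_frames = 663                                    # number of frames reported in the abstract
human_bbps = rng.integers(0, 4, size=n_frames)    # hypothetical per-frame human BBPS scores (0-3)
gpt_bbps = rng.integers(0, 4, size=n_frames)      # hypothetical per-frame ChatGPT-4 BBPS scores (0-3)

# Agreement on the binary "adequate preparation" call (BBPS > 1)
kappa = cohen_kappa_score(human_bbps > 1, gpt_bbps > 1)

def kendalls_w(ratings: np.ndarray) -> float:
    """Kendall's coefficient of concordance for ratings of shape (raters, items),
    without the tie-correction term (ties lower W slightly)."""
    m, n = ratings.shape
    ranks = np.vstack([rankdata(r) for r in ratings])  # rank items within each rater
    rank_sums = ranks.sum(axis=0)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12.0 * s / (m ** 2 * (n ** 3 - n))

w = kendalls_w(np.vstack([human_bbps, gpt_bbps]))
print(f"kappa = {kappa:.3f}, Kendall's W = {w:.3f}")
```

With only two raters, W is a simple transform of the average Spearman rank correlation (W = (ρ + 1)/2), so either statistic reads as a rank-order concordance measure between the human and ChatGPT-4 scores.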