Glaucoma Detection and Feature Identification via GPT-4V Fundus Image Analysis

Purpose: The aim is to assess GPT-4V's (OpenAI) diagnostic accuracy and its capability to identify glaucoma-related features compared to expert evaluations. Design: Evaluation of multimodal large language models for reviewing fundus images in glaucoma. Subjects: A total of 300 fundus images fro...

Bibliographic Details
Main Authors:	Jalil Jalili, PhD, Anuwat Jiravarnsirikul, MD, Christopher Bowd, PhD, Benton Chuter, MD, Akram Belghith, PhD, Michael H. Goldbaum, MD, Sally L. Baxter, MD, Robert N. Weinreb, MD, Linda M. Zangwill, PhD, Mark Christopher, PhD
Format:	Article
Language:	English
Published:	Elsevier 2025-03-01
Series:	Ophthalmology Science
Subjects:	Artificial intelligence; Fundus image analysis; Glaucoma detection; GPT-4V; Large multimodal models
Online Access:	http://www.sciencedirect.com/science/article/pii/S2666914524002033

collection	DOAJ
description	Purpose: To assess GPT-4V's (OpenAI) diagnostic accuracy and its capability to identify glaucoma-related features compared with expert evaluations. Design: Evaluation of multimodal large language models for reviewing fundus images in glaucoma. Subjects: A total of 300 fundus images from 3 public datasets (ACRIMA, ORIGA, and RIM-ONE v3), comprising 139 glaucomatous and 161 nonglaucomatous cases, were analyzed. Methods: Preprocessing ensured each image was centered on the optic disc. GPT-4's vision-preview model (GPT-4V) assessed each image for various glaucoma-related criteria: image quality, image gradability, cup-to-disc ratio, peripapillary atrophy, disc hemorrhages, rim thinning (by quadrant and clock hour), glaucoma status, and estimated probability of glaucoma. Each image was analyzed twice by GPT-4V to evaluate consistency in its predictions. Two expert graders independently evaluated the same images using identical criteria. Comparisons between GPT-4V's assessments, expert evaluations, and dataset labels were made to determine accuracy, sensitivity, specificity, and Cohen kappa. Main Outcome Measures: The accuracy, sensitivity, specificity, and Cohen kappa of GPT-4V in detecting glaucoma compared with expert evaluations. Results: GPT-4V provided glaucoma assessments for all 300 fundus images across the datasets, although approximately 35% required multiple prompt submissions. Across the ACRIMA, ORIGA, and RIM-ONE datasets, GPT-4V's overall accuracy in glaucoma detection (0.68, 0.70, and 0.81, respectively) was slightly lower than that of expert grader 1 (0.78, 0.80, and 0.88) and expert grader 2 (0.72, 0.78, and 0.87). In glaucoma detection, GPT-4V showed variable agreement by dataset and expert grader, with Cohen kappa values ranging from 0.08 to 0.72. In terms of feature detection, GPT-4V demonstrated high consistency (repeatability) in image gradability, with an agreement accuracy of ≥89%, and substantial agreement in rim thinning and cup-to-disc ratio assessments, although kappas were generally lower than expert-to-expert agreement. Conclusions: GPT-4V shows promise as a tool in glaucoma screening and detection through fundus image analysis, demonstrating generally high agreement with expert evaluations of key diagnostic features, although agreement varied substantially across datasets. Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
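The four outcome measures named in the description (accuracy, sensitivity, specificity, and Cohen kappa) can all be derived from a 2x2 confusion table. The sketch below is only an illustration of those standard formulas, not the authors' analysis code; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study data.

```python
from typing import Dict, Sequence


def binary_metrics(truth: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and Cohen kappa for 0/1 labels.

    1 = glaucoma, 0 = non-glaucoma. `truth` is the reference rater
    (e.g. a dataset label or an expert grade); `pred` is the rater under test.
    """
    if len(truth) != len(pred) or len(truth) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Cells of the 2x2 confusion table.
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")

    # Cohen kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from each rater's marginal rates.
    p_yes = ((tp + fn) / n) * ((tp + fp) / n)
    p_no = ((tn + fp) / n) * ((tn + fn) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else float("nan")

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    model = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    print(binary_metrics(reference, model))
```

The same function could be applied to two runs of one rater on the same images: in that case the accuracy value is the raw agreement rate between runs (the repeatability notion mentioned in the results) and kappa is the chance-corrected agreement.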
id	doaj-art-3c1f903e5c9a4fc7badc40083db8d07d
institution	Kabale University
issn	2666-9145
spelling	doaj-art-3c1f903e5c9a4fc7badc40083db8d07d; 2025-01-12T05:26:11Z; eng; Elsevier; Ophthalmology Science; 2666-9145; 2025-03-01; volume 5, issue 2, article 100667
Author affiliations:
Jalil Jalili, PhD; Christopher Bowd, PhD; Benton Chuter, MD; Akram Belghith, PhD; Michael H. Goldbaum, MD; Sally L. Baxter, MD; Robert N. Weinreb, MD; Linda M. Zangwill, PhD; Mark Christopher, PhD: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; Hamilton Glaucoma Center, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Anuwat Jiravarnsirikul, MD: Hamilton Glaucoma Center, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; Faculty of Medicine Siriraj Hospital, Department of Ophthalmology, Mahidol University, Bangkok, Thailand
Correspondence: Mark Christopher, PhD, University of California San Diego, 9500 Gilman St., San Diego, CA 92117.

Similar Items