Glaucoma Detection and Feature Identification via GPT-4V Fundus Image Analysis

Purpose: The aim is to assess GPT-4V's (OpenAI) diagnostic accuracy and its capability to identify glaucoma-related features compared to expert evaluations. Design: Evaluation of multimodal large language models for reviewing fundus images in glaucoma. Subjects: A total of 300 fundus images fro...

Bibliographic Details
Main Authors: Jalil Jalili, PhD, Anuwat Jiravarnsirikul, MD, Christopher Bowd, PhD, Benton Chuter, MD, Akram Belghith, PhD, Michael H. Goldbaum, MD, Sally L. Baxter, MD, Robert N. Weinreb, MD, Linda M. Zangwill, PhD, Mark Christopher, PhD
Format: Article
Language: English
Published: Elsevier 2025-03-01
Series: Ophthalmology Science
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2666914524002033
collection DOAJ
description Purpose: To assess GPT-4V's (OpenAI) diagnostic accuracy and its ability to identify glaucoma-related features compared with expert evaluations.
Design: Evaluation of multimodal large language models for reviewing fundus images in glaucoma.
Subjects: A total of 300 fundus images from 3 public datasets (ACRIMA, ORIGA, and RIM-ONE v3), comprising 139 glaucomatous and 161 nonglaucomatous cases, were analyzed.
Methods: Preprocessing ensured each image was centered on the optic disc. GPT-4's vision-preview model (GPT-4V) assessed each image for various glaucoma-related criteria: image quality, image gradability, cup-to-disc ratio, peripapillary atrophy, disc hemorrhages, rim thinning (by quadrant and clock hour), glaucoma status, and estimated probability of glaucoma. Each image was analyzed twice by GPT-4V to evaluate the consistency of its predictions. Two expert graders independently evaluated the same images using identical criteria. GPT-4V's assessments, the expert evaluations, and the dataset labels were compared to determine accuracy, sensitivity, specificity, and Cohen kappa.
Main Outcome Measures: Accuracy, sensitivity, specificity, and Cohen kappa of GPT-4V in detecting glaucoma compared with expert evaluations.
Results: GPT-4V provided glaucoma assessments for all 300 fundus images across the datasets, although approximately 35% required multiple prompt submissions. Across the ACRIMA, ORIGA, and RIM-ONE datasets, GPT-4V's overall accuracy in glaucoma detection (0.68, 0.70, and 0.81, respectively) was slightly lower than that of expert grader 1 (0.78, 0.80, and 0.88, respectively) and expert grader 2 (0.72, 0.78, and 0.87, respectively). In glaucoma detection, GPT-4V's agreement varied by dataset and expert grader, with Cohen kappa values ranging from 0.08 to 0.72. In feature detection, GPT-4V demonstrated high consistency (repeatability) in image gradability, with an agreement accuracy of ≥89%, and substantial agreement in rim thinning and cup-to-disc ratio assessments, although kappas were generally lower than expert-to-expert agreement.
Conclusions: GPT-4V shows promise as a tool in glaucoma screening and detection through fundus image analysis, demonstrating generally high agreement with expert evaluations of key diagnostic features, although agreement varied substantially across datasets.
Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
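The outcome measures named in the abstract (accuracy, sensitivity, specificity, and Cohen kappa) can be illustrated with a minimal sketch for binary glaucoma labels. This is not the authors' code: the function names and the toy label lists are hypothetical, and the coding 1 = glaucoma, 0 = nonglaucoma is an assumption made for the example.

```python
import math
from typing import Dict, Sequence, Tuple


def confusion_counts(truth: Sequence[int], pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, assuming 1 = glaucoma and 0 = nonglaucoma."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def binary_metrics(truth: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and Cohen kappa for two binary ratings."""
    tp, fp, fn, tn = confusion_counts(truth, pred)
    n = tp + fp + fn + tn
    accuracy = _ratio(tp + tn, n)
    # Chance agreement: probability both ratings say "glaucoma" plus
    # probability both say "nonglaucoma", from the marginal totals.
    p_expected = _ratio((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn), n * n)
    return {
        "accuracy": accuracy,
        "sensitivity": _ratio(tp, tp + fn),  # true-positive rate
        "specificity": _ratio(tn, tn + fp),  # true-negative rate
        "kappa": _ratio(accuracy - p_expected, 1 - p_expected),
    }


# Toy labels, invented for illustration only (not data from the study).
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(binary_metrics(toy_truth, toy_pred))
```

The same function covers both kinds of comparison described in the abstract: passing dataset labels as `truth` gives diagnostic accuracy, while passing two graders' ratings (or two GPT-4V runs on the same images) gives their agreement and kappa, since kappa is symmetric in its two inputs.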
format Article
id doaj-art-3c1f903e5c9a4fc7badc40083db8d07d
institution Kabale University
issn 2666-9145
language English
publishDate 2025-03-01
publisher Elsevier
record_format Article
series Ophthalmology Science
spelling Record doaj-art-3c1f903e5c9a4fc7badc40083db8d07d, indexed 2025-01-12T05:26:11Z. Ophthalmology Science (Elsevier), ISSN 2666-9145, published 2025-03-01, volume 5, issue 2, article 100667.
Author affiliations:
Jalil Jalili, PhD; Christopher Bowd, PhD; Benton Chuter, MD; Akram Belghith, PhD; Michael H. Goldbaum, MD; Sally L. Baxter, MD; Robert N. Weinreb, MD; Linda M. Zangwill, PhD; Mark Christopher, PhD: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; Hamilton Glaucoma Center, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
Anuwat Jiravarnsirikul, MD: Hamilton Glaucoma Center, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; Faculty of Medicine Siriraj Hospital, Department of Ophthalmology, Mahidol University, Bangkok, Thailand
Correspondence: Mark Christopher, PhD, University of California San Diego, 9500 Gilman St., San Diego, CA 92117.
title Glaucoma Detection and Feature Identification via GPT-4V Fundus Image Analysis
topic Artificial intelligence
Fundus image analysis
Glaucoma detection
GPT-4V
Large multimodal models
url http://www.sciencedirect.com/science/article/pii/S2666914524002033