Glaucoma Detection and Feature Identification via GPT-4V Fundus Image Analysis
Purpose: The aim is to assess GPT-4V's (OpenAI) diagnostic accuracy and its capability to identify glaucoma-related features compared to expert evaluations. Design: Evaluation of multimodal large language models for reviewing fundus images in glaucoma. Subjects: A total of 300 fundus images fro...
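Main Authors: | Jalil Jalili, Anuwat Jiravarnsirikul, Christopher Bowd, Benton Chuter, Akram Belghith, Michael H. Goldbaum, Sally L. Baxter, Robert N. Weinreb, Linda M. Zangwill, Mark Christopher |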
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2025-03-01 |
Series: | Ophthalmology Science |
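Subjects: | Artificial intelligence; Fundus image analysis; Glaucoma detection; GPT-4V; Large multimodal models |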
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666914524002033 |
_version_ | 1841545495536730112 |
---|---|
author | Jalil Jalili, PhD Anuwat Jiravarnsirikul, MD Christopher Bowd, PhD Benton Chuter, MD Akram Belghith, PhD Michael H. Goldbaum, MD Sally L. Baxter, MD Robert N. Weinreb, MD Linda M. Zangwill, PhD Mark Christopher, PhD |
author_sort | Jalil Jalili, PhD |
collection | DOAJ |
description | Purpose: To assess the diagnostic accuracy of GPT-4V (OpenAI) and its ability to identify glaucoma-related features, compared with expert evaluations. Design: Evaluation of multimodal large language models for reviewing fundus images in glaucoma. Subjects: A total of 300 fundus images from 3 public datasets (ACRIMA, ORIGA, and RIM-ONE v3), comprising 139 glaucomatous and 161 nonglaucomatous cases, were analyzed. Methods: Preprocessing ensured each image was centered on the optic disc. GPT-4's vision-preview model (GPT-4V) assessed each image for various glaucoma-related criteria: image quality, image gradability, cup-to-disc ratio, peripapillary atrophy, disc hemorrhages, rim thinning (by quadrant and clock hour), glaucoma status, and estimated probability of glaucoma. Each image was analyzed twice by GPT-4V to evaluate consistency in its predictions. Two expert graders independently evaluated the same images using identical criteria. GPT-4V's assessments, expert evaluations, and dataset labels were compared to determine accuracy, sensitivity, specificity, and Cohen kappa. Main Outcome Measures: The accuracy, sensitivity, specificity, and Cohen kappa of GPT-4V in detecting glaucoma compared with expert evaluations. Results: GPT-4V provided glaucoma assessments for all 300 fundus images across the datasets, although approximately 35% required multiple prompt submissions. Across the ACRIMA, ORIGA, and RIM-ONE datasets, GPT-4V's overall accuracy in glaucoma detection (0.68, 0.70, and 0.81, respectively) was slightly lower than that of expert grader 1 (0.78, 0.80, and 0.88) and expert grader 2 (0.72, 0.78, and 0.87). In glaucoma detection, GPT-4V's agreement varied by dataset and expert grader, with Cohen kappa values ranging from 0.08 to 0.72. In terms of feature detection, GPT-4V demonstrated high consistency (repeatability) in image gradability, with an agreement accuracy of ≥89%, and substantial agreement in rim thinning and cup-to-disc ratio assessments, although kappas were generally lower than expert-to-expert agreement. Conclusions: GPT-4V shows promise as a tool in glaucoma screening and detection through fundus image analysis, demonstrating generally high agreement with expert evaluations of key diagnostic features, although agreement varied substantially across datasets. Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article. |
format | Article |
id | doaj-art-3c1f903e5c9a4fc7badc40083db8d07d |
institution | Kabale University |
issn | 2666-9145 |
language | English |
publishDate | 2025-03-01 |
publisher | Elsevier |
record_format | Article |
series | Ophthalmology Science |
spelling | doaj-art-3c1f903e5c9a4fc7badc40083db8d07d; 2025-01-12T05:26:11Z; eng; Elsevier; Ophthalmology Science; 2666-9145; 2025-03-01; vol. 5, no. 2, art. 100667; Glaucoma Detection and Feature Identification via GPT-4V Fundus Image Analysis. Authors: Jalil Jalili, PhD (0); Anuwat Jiravarnsirikul, MD (1); Christopher Bowd, PhD (2); Benton Chuter, MD (3); Akram Belghith, PhD (4); Michael H. Goldbaum, MD (5); Sally L. Baxter, MD (6); Robert N. Weinreb, MD (7); Linda M. Zangwill, PhD (8); Mark Christopher, PhD (9). Affiliations, authors 0 and 2 to 9: Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; Hamilton Glaucoma Center, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California. Affiliations, author 1: Hamilton Glaucoma Center, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; Faculty of Medicine Siriraj Hospital, Department of Ophthalmology, Mahidol University, Bangkok, Thailand. Correspondence: Mark Christopher, PhD, University of California San Diego, 9500 Gilman St., San Diego, CA 92117. URL: http://www.sciencedirect.com/science/article/pii/S2666914524002033. Keywords: Artificial intelligence; Fundus image analysis; Glaucoma detection; GPT-4V; Large multimodal models |
title | Glaucoma Detection and Feature Identification via GPT-4V Fundus Image Analysis |
title_sort | glaucoma detection and feature identification via gpt 4v fundus image analysis |
topic | Artificial intelligence Fundus image analysis Glaucoma detection GPT-4V Large multimodal models |
url | http://www.sciencedirect.com/science/article/pii/S2666914524002033 |
work_keys_str_mv | AT jaliljaliliphd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT anuwatjiravarnsirikulmd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT christopherbowdphd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT bentonchutermd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT akrambelghithphd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT michaelhgoldbaummd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT sallylbaxtermd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT robertnweinrebmd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT lindamzangwillphd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis AT markchristopherphd glaucomadetectionandfeatureidentificationviagpt4vfundusimageanalysis |
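
The description above names accuracy, sensitivity, specificity, and Cohen kappa as the main outcome measures. As an illustration only, the following is a minimal Python sketch of how those four quantities can be computed from binary labels (1 = glaucoma, 0 = nonglaucoma). It is not the authors' code: the function name `binary_metrics` and the toy labels are assumptions made for this example and do not come from the study.

```python
import math
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and Cohen kappa for 0/1 labels.

    Hypothetical helper for illustration; 1 is the positive (glaucoma) class.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan

    # Cohen kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected != 1 else math.nan

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Toy labels, invented for this example (not study data).
reference = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(reference, predicted)
```

The same function can compare any two raters, for example model output against a dataset label or one grader against another, since Cohen kappa is symmetric in its two inputs.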