The Ability of Large Language Models to Generate Patient Information Materials for Retinopathy of Prematurity: Evaluation of Readability, Accuracy, and Comprehensiveness

Objectives: This study compared the readability of patient education materials from the Turkish Ophthalmological Association (TOA) retinopathy of prematurity (ROP) guidelines with those generated by large language models (LLMs). The ability of GPT-4.0, GPT-4o mini, and Gemini to produce patient education materials was evaluated in terms of accuracy and comprehensiveness. Materials and Methods: Thirty questions from the TOA ROP guidelines were posed to GPT-4.0, GPT-4o mini, and Gemini. Their responses were then reformulated using the prompts “Can you revise this text to be understandable at a 6th-grade reading level?” (P1 format) and “Can you make this text easier to understand?” (P2 format). The readability of the TOA ROP guidelines and the LLM-generated responses was analyzed using the Ateşman and Bezirci-Yılmaz formulas. Additionally, ROP specialists evaluated the comprehensiveness and accuracy of the responses. Results: The TOA brochure was found to have a reading level above the 6th-grade level recommended in the literature. Materials generated by GPT-4.0 and Gemini had significantly greater readability than the TOA brochure (p<0.05). Adjustments made in the P1 and P2 formats improved readability for GPT-4.0, while no significant change was observed for GPT-4o mini and Gemini. GPT-4.0 had the highest scores for accuracy and comprehensiveness, while Gemini had the lowest. Conclusion: GPT-4.0 appeared to have greater potential for generating more readable, accurate, and comprehensive patient education materials. However, when integrating LLMs into the healthcare field, regional medical differences and the accuracy of the provided information must be carefully assessed.


Bibliographic Details
Main Authors: Sevinç Arzu Postacı, Ali Dal
Format: Article
Language:English
Published: Galenos Yayinevi 2024-12-01
Series:Türk Oftalmoloji Dergisi
Subjects:
Online Access:https://www.oftalmoloji.org/articles/the-ability-of-large-language-models-to-generate-patient-information-materials-for-retinopathy-of-prematurity-evaluation-of-readability-accuracy-and-comprehensiveness/doi/tjo.galenos.2024.58295
_version_ 1841556515503210496
author Sevinç Arzu Postacı
Ali Dal
author_facet Sevinç Arzu Postacı
Ali Dal
author_sort Sevinç Arzu Postacı
collection DOAJ
description Objectives: This study compared the readability of patient education materials from the Turkish Ophthalmological Association (TOA) retinopathy of prematurity (ROP) guidelines with those generated by large language models (LLMs). The ability of GPT-4.0, GPT-4o mini, and Gemini to produce patient education materials was evaluated in terms of accuracy and comprehensiveness. Materials and Methods: Thirty questions from the TOA ROP guidelines were posed to GPT-4.0, GPT-4o mini, and Gemini. Their responses were then reformulated using the prompts “Can you revise this text to be understandable at a 6th-grade reading level?” (P1 format) and “Can you make this text easier to understand?” (P2 format). The readability of the TOA ROP guidelines and the LLM-generated responses was analyzed using the Ateşman and Bezirci-Yılmaz formulas. Additionally, ROP specialists evaluated the comprehensiveness and accuracy of the responses. Results: The TOA brochure was found to have a reading level above the 6th-grade level recommended in the literature. Materials generated by GPT-4.0 and Gemini had significantly greater readability than the TOA brochure (p<0.05). Adjustments made in the P1 and P2 formats improved readability for GPT-4.0, while no significant change was observed for GPT-4o mini and Gemini. GPT-4.0 had the highest scores for accuracy and comprehensiveness, while Gemini had the lowest. Conclusion: GPT-4.0 appeared to have greater potential for generating more readable, accurate, and comprehensive patient education materials. However, when integrating LLMs into the healthcare field, regional medical differences and the accuracy of the provided information must be carefully assessed.
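The Ateşman formula named in the abstract is a published adaptation of the Flesch Reading Ease score for Turkish: score = 198.825 − 40.175 × (syllables/words) − 2.610 × (words/sentences), on a 0-100-style scale where higher values mean easier text. As an illustration only (not the authors' evaluation code), a minimal sketch might look like the following; it exploits the fact that every Turkish syllable contains exactly one vowel, so syllable counts can be approximated by vowel counts:

```python
import re

# Turkish vowels, upper- and lowercase (including dotless ı and dotted İ).
VOWELS = set("aeıioöuüAEIİOÖUÜ")

def count_syllables(word: str) -> int:
    # Each Turkish syllable contains exactly one vowel,
    # so the vowel count equals the syllable count.
    return sum(1 for ch in word if ch in VOWELS)

def atesman_score(text: str) -> float:
    # Ateşman (1997) readability score for Turkish text:
    # 198.825 - 40.175 * (avg syllables per word)
    #         -  2.610 * (avg words per sentence)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (198.825
            - 40.175 * (syllables / len(words))
            - 2.610 * (len(words) / len(sentences)))
```

Short sentences of short words score high, long sentences of polysyllabic words score low, which is why the study could compare the TOA brochure against prompt-simplified LLM output on a common numeric scale. (The Bezirci-Yılmaz formula used alongside it instead weights counts of 3-, 4-, 5-, and 6+-syllable words per sentence.)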
format Article
id doaj-art-9b486c184d904320917a8538cbf27a0c
institution Kabale University
issn 1300-0659
2147-2661
language English
publishDate 2024-12-01
publisher Galenos Yayinevi
record_format Article
series Türk Oftalmoloji Dergisi
spelling doaj-art-9b486c184d904320917a8538cbf27a0c
indexed 2025-01-07T08:17:57Z
language eng
publisher Galenos Yayinevi
series Türk Oftalmoloji Dergisi
issn 1300-0659, 2147-2661
published 2024-12-01, Volume 54, Issue 6, Pages 330-336
doi 10.4274/tjo.galenos.2024.58295
title The Ability of Large Language Models to Generate Patient Information Materials for Retinopathy of Prematurity: Evaluation of Readability, Accuracy, and Comprehensiveness
author Sevinç Arzu Postacı (https://orcid.org/0000-0002-3778-6583), Mustafa Kemal University Tayfur Sökmen Faculty of Medicine, Department of Ophthalmology, Hatay, Türkiye
author Ali Dal (https://orcid.org/0000-0002-0748-6416), Mustafa Kemal University Tayfur Sökmen Faculty of Medicine, Department of Ophthalmology, Hatay, Türkiye
abstract (as in the description field above)
url https://www.oftalmoloji.org/articles/the-ability-of-large-language-models-to-generate-patient-information-materials-for-retinopathy-of-prematurity-evaluation-of-readability-accuracy-and-comprehensiveness/doi/tjo.galenos.2024.58295
keywords retinopathy of prematurity; large language models; readability; patient education
spellingShingle Sevinç Arzu Postacı
Ali Dal
The Ability of Large Language Models to Generate Patient Information Materials for Retinopathy of Prematurity: Evaluation of Readability, Accuracy, and Comprehensiveness
Türk Oftalmoloji Dergisi
retinopathy of prematurity
large language models
readability
patient education
title The Ability of Large Language Models to Generate Patient Information Materials for Retinopathy of Prematurity: Evaluation of Readability, Accuracy, and Comprehensiveness
title_full The Ability of Large Language Models to Generate Patient Information Materials for Retinopathy of Prematurity: Evaluation of Readability, Accuracy, and Comprehensiveness
title_fullStr The Ability of Large Language Models to Generate Patient Information Materials for Retinopathy of Prematurity: Evaluation of Readability, Accuracy, and Comprehensiveness
title_full_unstemmed The Ability of Large Language Models to Generate Patient Information Materials for Retinopathy of Prematurity: Evaluation of Readability, Accuracy, and Comprehensiveness
title_short The Ability of Large Language Models to Generate Patient Information Materials for Retinopathy of Prematurity: Evaluation of Readability, Accuracy, and Comprehensiveness
title_sort ability of large language models to generate patient information materials for retinopathy of prematurity evaluation of readability accuracy and comprehensiveness
topic retinopathy of prematurity
large language models
readability
patient education
url https://www.oftalmoloji.org/articles/the-ability-of-large-language-models-to-generate-patient-information-materials-for-retinopathy-of-prematurity-evaluation-of-readability-accuracy-and-comprehensiveness/doi/tjo.galenos.2024.58295
work_keys_str_mv AT sevincarzupostacı theabilityoflargelanguagemodelstogeneratepatientinformationmaterialsforretinopathyofprematurityevaluationofreadabilityaccuracyandcomprehensiveness
AT alidal theabilityoflargelanguagemodelstogeneratepatientinformationmaterialsforretinopathyofprematurityevaluationofreadabilityaccuracyandcomprehensiveness
AT sevincarzupostacı abilityoflargelanguagemodelstogeneratepatientinformationmaterialsforretinopathyofprematurityevaluationofreadabilityaccuracyandcomprehensiveness
AT alidal abilityoflargelanguagemodelstogeneratepatientinformationmaterialsforretinopathyofprematurityevaluationofreadabilityaccuracyandcomprehensiveness