Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation

Background. The availability of high-quality and easy-to-read informative material is crucial to providing accurate information to prospective kidney donors. The quality of this information has been associated with the likelihood of proceeding with a living donation. Artificial intelligence–based la...

Bibliographic Details
Main Authors:	Vincenzo Villani, MD; Hong-Hanh T. Nguyen, NP; Kumaran Shanmugarajah, MD, PhD
Format:	Article
Language:	English
Published:	Wolters Kluwer 2025-01-01
Series:	Transplantation Direct
Online Access:	http://journals.lww.com/transplantationdirect/fulltext/10.1097/TXD.0000000000001740

author	Vincenzo Villani, MD; Hong-Hanh T. Nguyen, NP; Kumaran Shanmugarajah, MD, PhD
collection	DOAJ
description	Background. The availability of high-quality and easy-to-read informative material is crucial to providing accurate information to prospective kidney donors. The quality of this information has been associated with the likelihood of proceeding with a living donation. Artificial intelligence–based large language models (LLMs) have recently become common instruments for acquiring information online, including medical information. The aim of this study was to assess the quality and readability of artificial intelligence-generated information on kidney donation. Methods. A set of 35 common donor questions was developed by the authors and used to interrogate 3 LLMs (ChatGPT, Google Gemini, and MedGPT). Answers were collected and independently evaluated using the CLEAR tool for (1) completeness, (2) lack of false information, (3) evidence-based information, (4) appropriateness, and (5) relevance. Readability was evaluated using the Flesch-Kincaid Reading Ease Score and the Flesch-Kincaid Grade Level. Results. The interrater intraclass correlation was 0.784 (95% confidence interval, 0.716-0.814). Median CLEAR scores were ChatGPT 22 (interquartile range [IQR], 3.67), Google Gemini 24.33 (IQR, 2.33), and MedGPT 23.33 (IQR, 2.00). ChatGPT, Gemini, and MedGPT had mean Flesch-Kincaid Reading Ease Scores of 37.32 (SD = 10.00), 39.42 (SD = 13.49), and 29.66 (SD = 7.94), respectively. Using the Flesch-Kincaid Grade Level assessment, ChatGPT had an average score of 12.29, Gemini had 10.63, and MedGPT had 13.21 (P < 0.001), indicating that all LLMs had readability at a college level of education. Conclusions. Current LLMs provide fairly accurate responses to common prospective living kidney donor questions; however, the generated information is complex and requires an advanced level of education. As LLMs become more relevant in the field of medical information, transplant providers should familiarize themselves with the shortcomings of these technologies.
format	Article
id	doaj-art-86002f89272c4753b1906b6f4ccf3714
institution	Kabale University
issn	2373-8731
language	English
publishDate	2025-01-01
publisher	Wolters Kluwer
record_format	Article
series	Transplantation Direct
spelling	doaj-art-86002f89272c4753b1906b6f4ccf3714; 2024-12-24T09:47:18Z; eng; Wolters Kluwer; Transplantation Direct; 2373-8731; 2025-01-01; 11; 1; e1740; 10.1097/TXD.0000000000001740; 202501000-00003; Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation; Vincenzo Villani, MD (0); Hong-Hanh T. Nguyen, NP (1); Kumaran Shanmugarajah, MD, PhD (2); 1 Division of Immunology and Organ Transplantation, McGovern Medical School at UTHealth Houston, Houston, TX; 2 Liver Specialists of Texas, Houston, TX; 3 Department of Surgery, Transplantation Center, Digestive Disease and Surgery Institute, Cleveland Clinic, Cleveland, OH; http://journals.lww.com/transplantationdirect/fulltext/10.1097/TXD.0000000000001740
title	Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation
url	http://journals.lww.com/transplantationdirect/fulltext/10.1097/TXD.0000000000001740
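
The abstract reports readability with two standard measures, the Flesch Reading Ease score and the Flesch-Kincaid Grade Level. The record does not say which software the authors used, so the following is only a minimal sketch of the two published formulas. The function names, the sentence and word splitting rules, and the syllable heuristic are all assumptions made for illustration; dedicated readability tools count syllables more carefully and may give different values.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) using simple regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(sentences: int, words: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences: int, words: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical sample answer, not taken from the study.
    sample = "Living donors are evaluated carefully. Most recover within weeks."
    stats = text_stats(sample)
    print("Reading ease:", round(flesch_reading_ease(*stats), 2))
    print("Grade level:", round(flesch_kincaid_grade(*stats), 2))
```

Both formulas depend only on average sentence length and average syllables per word, which is why long sentences and polysyllabic medical terms push Reading Ease down and Grade Level up, consistent with the direction of the scores in the abstract.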

Similar Items