Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation

Bibliographic Details
Main Authors: Vincenzo Villani, MD, Hong-Hanh T. Nguyen, NP, Kumaran Shanmugarajah, MD, PhD
Format: Article
Language: English
Published: Wolters Kluwer 2025-01-01
Series: Transplantation Direct
Online Access: http://journals.lww.com/transplantationdirect/fulltext/10.1097/TXD.0000000000001740
collection DOAJ
description Background. The availability of high-quality, easy-to-read informational material is crucial to providing accurate information to prospective kidney donors. The quality of this information has been associated with the likelihood of proceeding with living donation. Artificial intelligence–based large language models (LLMs) have recently become common tools for acquiring information online, including medical information. The aim of this study was to assess the quality and readability of artificial intelligence–generated information on kidney donation. Methods. A set of 35 common donor questions was developed by the authors and used to interrogate 3 LLMs (ChatGPT, Google Gemini, and MedGPT). Answers were collected and independently evaluated using the CLEAR tool for (1) completeness, (2) lack of false information, (3) evidence-based information, (4) appropriateness, and (5) relevance. Readability was evaluated using the Flesch-Kincaid Reading Ease Score and the Flesch-Kincaid Grade Level. Results. The interrater intraclass correlation was 0.784 (95% confidence interval, 0.716-0.814). Median CLEAR scores were 22 for ChatGPT (interquartile range [IQR], 3.67), 24.33 for Google Gemini (IQR, 2.33), and 23.33 for MedGPT (IQR, 2.00). ChatGPT, Gemini, and MedGPT had mean Flesch-Kincaid Reading Ease Scores of 37.32 (SD = 10.00), 39.42 (SD = 13.49), and 29.66 (SD = 7.94), respectively. On the Flesch-Kincaid Grade Level assessment, ChatGPT had an average score of 12.29, Gemini 10.63, and MedGPT 13.21 (P < 0.001), indicating that all 3 LLMs produced text at a college reading level. Conclusions. Current LLMs provide fairly accurate responses to common questions from prospective living kidney donors; however, the generated information is complex and requires an advanced level of education to read. As LLMs become more relevant in the field of medical information, transplant providers should familiarize themselves with the shortcomings of these technologies.
format Article
id doaj-art-86002f89272c4753b1906b6f4ccf3714
institution Kabale University
issn 2373-8731
spelling doaj-art-86002f89272c4753b1906b6f4ccf3714
2024-12-24T09:47:18Z
eng
Wolters Kluwer
Transplantation Direct
2373-8731
2025-01-01
11
1
e1740
10.1097/TXD.0000000000001740
202501000-00003
Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation
Vincenzo Villani, MD
Hong-Hanh T. Nguyen, NP
Kumaran Shanmugarajah, MD, PhD
1 Division of Immunology and Organ Transplantation, McGovern Medical School at UTHealth Houston, Houston, TX.
2 Liver Specialists of Texas, Houston, TX.
3 Department of Surgery, Transplantation Center, Digestive Disease and Surgery Institute, Cleveland Clinic, Cleveland, OH.