Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation
Background. The availability of high-quality and easy-to-read informative material is crucial to providing accurate information to prospective kidney donors. The quality of this information has been associated with the likelihood of proceeding with a living donation. Artificial intelligence–based la...
| Main Authors: | Vincenzo Villani, Hong-Hanh T. Nguyen, Kumaran Shanmugarajah |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wolters Kluwer, 2025-01-01 |
| Series: | Transplantation Direct |
| Online Access: | http://journals.lww.com/transplantationdirect/fulltext/10.1097/TXD.0000000000001740 |
| author | Vincenzo Villani, MD; Hong-Hanh T. Nguyen, NP; Kumaran Shanmugarajah, MD, PhD |
| collection | DOAJ |
| description | Background. The availability of high-quality and easy-to-read informative material is crucial to providing accurate information to prospective kidney donors. The quality of this information has been associated with the likelihood of proceeding with a living donation. Artificial intelligence–based large language models (LLMs) have recently become common instruments for acquiring information online, including medical information. The aim of this study was to assess the quality and readability of artificial intelligence-generated information on kidney donation. Methods. A set of 35 common donor questions was developed by the authors and used to interrogate 3 LLMs (ChatGPT, Google Gemini, and MedGPT). Answers were collected and independently evaluated using the CLEAR tool for (1) completeness, (2) lack of false information, (3) evidence-based information, (4) appropriateness, and (5) relevance. Readability was evaluated using the Flesch-Kincaid Reading Ease Score and the Flesch-Kincaid Grade Level. Results. The interrater intraclass correlation was 0.784 (95% confidence interval, 0.716-0.814). Median CLEAR scores were ChatGPT 22 (interquartile range [IQR], 3.67), Google Gemini 24.33 (IQR, 2.33), and MedGPT 23.33 (IQR, 2.00). ChatGPT, Gemini, and MedGPT had mean Flesch-Kincaid Reading Ease Scores of 37.32 (SD = 10.00), 39.42 (SD = 13.49), and 29.66 (SD = 7.94), respectively. Using the Flesch-Kincaid Grade Level assessment, ChatGPT had an average score of 12.29, Gemini had 10.63, and MedGPT had 13.21 (P < 0.001), indicating that all LLMs had readability at a college education level. Conclusions. Current LLMs provide fairly accurate responses to common prospective living kidney donor questions; however, the generated information is complex and requires an advanced level of education. As LLMs become more relevant in the field of medical information, transplant providers should familiarize themselves with the shortcomings of these technologies. |
| format | Article |
| id | doaj-art-86002f89272c4753b1906b6f4ccf3714 |
| institution | Kabale University |
| issn | 2373-8731 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Wolters Kluwer |
| record_format | Article |
| series | Transplantation Direct |
| spelling | doaj-art-86002f89272c4753b1906b6f4ccf3714; 2024-12-24T09:47:18Z; eng; Wolters Kluwer; Transplantation Direct; ISSN 2373-8731; 2025-01-01; volume 11, issue 1, e1740; DOI 10.1097/TXD.0000000000001740; 202501000-00003. Authors: Vincenzo Villani, MD (1); Hong-Hanh T. Nguyen, NP (2); Kumaran Shanmugarajah, MD, PhD (3). Affiliations: (1) Division of Immunology and Organ Transplantation, McGovern Medical School at UTHealth Houston, Houston, TX; (2) Liver Specialists of Texas, Houston, TX; (3) Department of Surgery, Transplantation Center, Digestive Disease and Surgery Institute, Cleveland Clinic, Cleveland, OH. |
| title | Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation |
| url | http://journals.lww.com/transplantationdirect/fulltext/10.1097/TXD.0000000000001740 |
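
The abstract reports readability as a Flesch Reading Ease score and a Flesch-Kincaid Grade Level. As a rough illustration of how such scores are derived, the sketch below applies the two standard published formulas to a piece of text. The record does not say which tool the authors used, so the sentence splitter and the vowel-group syllable counter here are simplifying assumptions, and results may differ from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Heuristic syllable count (an assumption, not a dictionary lookup):
    count groups of consecutive vowels and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) using simple regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Higher scores mean easier text."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US school grade needed to understand the text."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


if __name__ == "__main__":
    # Hypothetical sample answer, not taken from the study.
    answer = "Most donors go home within a few days. Recovery takes several weeks."
    print(round(flesch_reading_ease(answer), 2))
    print(round(flesch_kincaid_grade(answer), 2))
```

Both formulas depend only on average sentence length and average syllables per word, so long sentences and polysyllabic medical terms push the Reading Ease score down and the Grade Level up.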