Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation
ABSTRACT Purpose Caregivers in pediatric oncology need accurate and understandable information about their child's condition, treatment, and side effects. This study assesses the performance of publicly accessible large language model (LLM)‐supported tools in providing valuable and reliable information to caregivers of children with cancer.
Main Authors: Emre Sezgin, Daniel I. Jackson, A. Baki Kocaballi, Mindy Bibart, Sue Zupanec, Wendy Landier, Anthony Audino, Mark Ranalli, Micah Skeens
Format: Article
Language: English
Published: Wiley, 2025-01-01
Series: Cancer Medicine
Subjects: artificial intelligence; health care communication; health literacy; large language models; patient education; pediatric oncology
Online Access: https://doi.org/10.1002/cam4.70554
_version_ | 1841543413922529280 |
author | Emre Sezgin; Daniel I. Jackson; A. Baki Kocaballi; Mindy Bibart; Sue Zupanec; Wendy Landier; Anthony Audino; Mark Ranalli; Micah Skeens |
author_facet | Emre Sezgin; Daniel I. Jackson; A. Baki Kocaballi; Mindy Bibart; Sue Zupanec; Wendy Landier; Anthony Audino; Mark Ranalli; Micah Skeens |
author_sort | Emre Sezgin |
collection | DOAJ |
description | ABSTRACT Purpose Caregivers in pediatric oncology need accurate and understandable information about their child's condition, treatment, and side effects. This study assesses the performance of publicly accessible large language model (LLM)‐supported tools in providing valuable and reliable information to caregivers of children with cancer. Methods In this cross‐sectional study, we evaluated the performance of four LLM‐supported tools—ChatGPT (GPT‐4), Google Bard (Gemini Pro), Microsoft Bing Chat, and Google SGE—against a set of frequently asked questions (FAQs) derived from the Children's Oncology Group Family Handbook and expert input (in total, 26 FAQs and 104 generated responses). Five pediatric oncology experts assessed the generated LLM responses on measures including accuracy, clarity, inclusivity, completeness, clinical utility, and overall rating. Additionally, content quality was evaluated, including readability, AI disclosure, source credibility, resource matching, and content originality. We used descriptive analysis and statistical tests, including Shapiro–Wilk, Levene's, and Kruskal–Wallis H‐tests, with Dunn's post hoc tests for pairwise comparisons. Results ChatGPT showed high overall performance when evaluated by the experts. Bard also performed well, especially in accuracy and clarity of responses, whereas Bing Chat and Google SGE had lower overall scores. Disclosure that responses were AI‐generated appeared less frequently in ChatGPT responses, which may have affected their clarity, whereas Bard maintained a balance between AI disclosure and response clarity. Google SGE generated the most readable responses, whereas ChatGPT's answers were the most complex. LLM tools varied significantly (p < 0.001) across all expert evaluations except inclusivity. Through our thematic analysis of expert free‐text comments, emotional tone and empathy emerged as a unique theme, with mixed feedback on whether AI should be expected to be empathetic. Conclusion LLM‐supported tools can enhance caregivers' knowledge of pediatric oncology. Each model has unique strengths and areas for improvement, indicating the need for careful selection based on specific clinical contexts. Further research is required to explore their application in other medical specialties and patient demographics, assessing broader applicability and long‐term impacts. |
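The statistical workflow named in the abstract (Shapiro–Wilk, Levene's, Kruskal–Wallis H, and pairwise post hoc tests) can be sketched in Python. Everything below is illustrative: the tool names and ratings are invented, not the study's data, and because Dunn's test typically requires the third-party scikit-posthocs package, Bonferroni-corrected Mann–Whitney U tests stand in here for the post hoc step.

```python
from itertools import combinations

from scipy import stats

# Hypothetical 1-5 expert ratings for four tools (made-up data;
# the tool names and scores are illustrative only).
scores = {
    "tool_a": [5, 4, 5, 4, 5, 4, 5, 5],
    "tool_b": [4, 4, 5, 3, 4, 4, 5, 4],
    "tool_c": [3, 2, 3, 3, 2, 3, 3, 2],
    "tool_d": [2, 3, 2, 2, 3, 2, 2, 3],
}
groups = list(scores.values())

# 1. Shapiro-Wilk: is each group plausibly normally distributed?
all_normal = all(stats.shapiro(g).pvalue > 0.05 for g in groups)

# 2. Levene's test: homogeneity of variances across groups.
_, levene_p = stats.levene(*groups)

# 3. Kruskal-Wallis H-test: non-parametric omnibus comparison,
# appropriate when normality fails (as ordinal rating data usually does).
h_stat, kw_p = stats.kruskal(*groups)

# 4. Pairwise post hoc step. The abstract names Dunn's test (commonly
# run via scikit-posthocs); Bonferroni-corrected Mann-Whitney U tests
# serve as a lightweight substitute here.
pairs = list(combinations(scores, 2))
alpha = 0.05 / len(pairs)  # Bonferroni-adjusted threshold
sig = {
    (a, b): stats.mannwhitneyu(scores[a], scores[b]).pvalue < alpha
    for a, b in pairs
}

print(f"normal={all_normal} levene_p={levene_p:.3f} "
      f"H={h_stat:.2f} p={kw_p:.4f} sig_pairs={sum(sig.values())}")
```

The Kruskal–Wallis test is the natural omnibus choice here because expert ratings are ordinal and rarely normally distributed; a significant omnibus result then justifies the pairwise post hoc comparisons.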
format | Article |
id | doaj-art-97bc95d6225a49cc9216fec416fe7a06 |
institution | Kabale University |
issn | 2045-7634 |
language | English |
publishDate | 2025-01-01 |
publisher | Wiley |
record_format | Article |
series | Cancer Medicine |
spelling | doaj-art-97bc95d6225a49cc9216fec416fe7a06 (indexed 2025-01-13T13:22:39Z). Cancer Medicine, vol. 14, no. 1 (2025-01-01), Wiley, eng, ISSN 2045-7634, pages n/a, doi:10.1002/cam4.70554. Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation. Author affiliations: Emre Sezgin, Daniel I. Jackson, Anthony Audino, Mark Ranalli, and Micah Skeens (The Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, Ohio, USA); A. Baki Kocaballi (Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia); Mindy Bibart (Division of Hematology/Oncology, Nationwide Children's Hospital, Columbus, Ohio, USA); Sue Zupanec (Hematology/Oncology Department, Hospital for Sick Children (Sick Kids), Toronto, Ontario, Canada); Wendy Landier (Institute for Cancer Outcomes and Survivorship, School of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, USA). Abstract and keywords as in the description and topic fields. Online access: https://doi.org/10.1002/cam4.70554 |
spellingShingle | Emre Sezgin; Daniel I. Jackson; A. Baki Kocaballi; Mindy Bibart; Sue Zupanec; Wendy Landier; Anthony Audino; Mark Ranalli; Micah Skeens | Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation | Cancer Medicine | artificial intelligence; health care communication; health literacy; large language models; patient education; pediatric oncology |
title | Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation |
title_full | Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation |
title_fullStr | Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation |
title_full_unstemmed | Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation |
title_short | Can Large Language Models Aid Caregivers of Pediatric Cancer Patients in Information Seeking? A Cross‐Sectional Investigation |
title_sort | can large language models aid caregivers of pediatric cancer patients in information seeking a cross sectional investigation |
topic | artificial intelligence; health care communication; health literacy; large language models; patient education; pediatric oncology |
url | https://doi.org/10.1002/cam4.70554 |