Enhancing Multilingual Patient Education: ChatGPT's Accuracy and Readability for SSNHL Queries in English and Spanish

Bibliographic Details
Main Authors: Emily Ajit‐Roger, Alexander Moise, Carolina Peralta, Ostap Orishchak, Sam J. Daniel
Format: Article
Language: English
Published: Wiley 2024-10-01
Series: OTO Open
Subjects:
Online Access: https://doi.org/10.1002/oto2.70048
Description
Summary (Abstract):
Objective: This study investigates ChatGPT's accuracy, readability, understandability, and actionability in responding to patient queries on sudden sensorineural hearing loss (SSNHL) in English and Spanish, when compared to Google responses. The objective is to address concerns regarding its proficiency in addressing medical inquiries when presented in a language divergent from its primary programming.
Study Design: Observational.
Setting: Virtual environment.
Methods: Using ChatGPT 3.5 and Google, questions from the AAO‐HNSF guidelines were presented in English and Spanish. Responses were graded by 2 otolaryngologists proficient in both languages using a 4‐point Likert scale and the PEMAT‐P tool. To ensure uniform application of the Likert scale, a third independent evaluator reviewed the consistency in grading. Readability was evaluated using 3 different tools specific to each language. IBM SPSS Version 29 was used for statistical analysis using one‐way analysis of variance.
Results: Across both languages, the responses displayed a native‐level language proficiency. Accuracy was comparable between sources and languages. Google's Spanish responses had better readability (effect size 0.35, P < .001), while Google's English responses were more understandable (effect size 0.67, P = .018). ChatGPT's English responses demonstrated the highest level of actionability (60%), though not significantly different when compared to other sources (effect size 0.47, P = .14).
Conclusion: ChatGPT offers patients comprehensive and guideline‐conforming answers to SSNHL patient medical queries in the 2 most spoken languages in the United States. However, improvements in its readability and understandability are warranted for more accessible patient education.
ISSN: 2473-974X