Assessing ChatGPT’s Reliability in Endodontics: Implications for AI-Enhanced Clinical Learning

The integration of large language models (LLMs) like ChatGPT is transforming education in the health sciences. This study evaluated the applicability of ChatGPT-4 and ChatGPT-4o in endodontics, focusing on their reliability and repeatability in responding to practitioner-level questions. Thirty closed-...

Bibliographic Details
Main Authors:	María Llorente de Pedro, Ana Suárez, Juan Algar, Víctor Díaz-Flores García, Cristina Andreu-Vázquez, Yolanda Freire
Format:	Article
Language:	English
Published:	MDPI AG 2025-05-01
Series:	Applied Sciences
Subjects:	AI; endodontics; ChatGPT; educational reliability; instructional
Online Access:	https://www.mdpi.com/2076-3417/15/10/5231

_version_	1849327605656846336
author	María Llorente de Pedro, Ana Suárez, Juan Algar, Víctor Díaz-Flores García, Cristina Andreu-Vázquez, Yolanda Freire
author_sort	María Llorente de Pedro
collection	DOAJ
description	The integration of large language models (LLMs) like ChatGPT is transforming education in the health sciences. This study evaluated the applicability of ChatGPT-4 and ChatGPT-4o in endodontics, focusing on their reliability and repeatability in responding to practitioner-level questions. Thirty closed-clinical questions, based on international guidelines, were each submitted thirty times to both models, generating a total of 1800 responses. These responses were evaluated by endodontic experts using a 3-point Likert scale. ChatGPT-4 achieved a reliability score of 52.67%, while ChatGPT-4o slightly outperformed it with 55.22%. Notably, ChatGPT-4o demonstrated greater response consistency, showing superior repeatability metrics such as Gwet’s AC1 and percentage agreement. While both models show promise in supporting learning, ChatGPT-4o may provide more consistent and pedagogically coherent feedback, particularly in contexts where response dependability is essential. From an educational standpoint, the findings support ChatGPT’s potential as a complementary tool for guided study or formative assessment in dentistry. However, due to moderate reliability, unsupervised use in specialized or clinically relevant contexts is not recommended. These insights are valuable for educators and instructional designers seeking to integrate AI into digital pedagogy. Further research should examine the performance of LLMs across diverse disciplines and formats to better define their role in AI-enhanced education.
format	Article
id	doaj-art-fd82d4e88d0249a89e7c52f5d8c579cf
institution	Kabale University
issn	2076-3417
language	English
publishDate	2025-05-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj-art-fd82d4e88d0249a89e7c52f5d8c579cf; 2025-08-20T03:47:49Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-05-01; 15; 10; 5231; 10.3390/app15105231; Assessing ChatGPT’s Reliability in Endodontics: Implications for AI-Enhanced Clinical Learning; María Llorente de Pedro (0); Ana Suárez (1); Juan Algar (2); Víctor Díaz-Flores García (3); Cristina Andreu-Vázquez (4); Yolanda Freire (5); (0) School for Doctoral Studies and Research, Universidad Europea de Madrid, 28670 Villaviciosa de Odón, Spain; (1) Department of Preclinical Dentistry II, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, 28670 Villaviciosa de Odón, Spain; (2) Department of Clinical Dentistry-Pregraduate Studies, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, 28670 Villaviciosa de Odón, Spain; (3) Department of Preclinical Dentistry I, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, 28670 Villaviciosa de Odón, Spain; (4) Department of Veterinary, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, 28670 Villaviciosa de Odón, Spain; (5) Department of Preclinical Dentistry II, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, 28670 Villaviciosa de Odón, Spain; https://www.mdpi.com/2076-3417/15/10/5231; AI; endodontics; ChatGPT; educational reliability; instructional
title	Assessing ChatGPT’s Reliability in Endodontics: Implications for AI-Enhanced Clinical Learning
topic	AI; endodontics; ChatGPT; educational reliability; instructional
url	https://www.mdpi.com/2076-3417/15/10/5231
