Comparison of the performances between ChatGPT and Gemini in answering questions on viral hepatitis

Bibliographic Details
Main Authors:	Meryem Sahin Ozdemir, Yusuf Emre Ozdemir
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-01-01
Series:	Scientific Reports
Subjects:	ChatGPT; Gemini; Viral hepatitis; Hepatitis B; Hepatitis C
Online Access:	https://doi.org/10.1038/s41598-024-83575-1

author	Meryem Sahin Ozdemir; Yusuf Emre Ozdemir
collection	DOAJ
description	Abstract This is the first study to evaluate the adequacy and reliability of the ChatGPT and Gemini chatbots on viral hepatitis. A total of 176 questions were compiled from three categories. The first group comprises “questions and answers (Q&As) for the public” determined by the Centers for Disease Control and Prevention (CDC). The second group comprises strong recommendations of international guidelines. The third group comprises frequently asked questions on social media platforms. The chatbots’ answers were evaluated by two infectious diseases specialists on a scoring scale from 1 to 4. Cohen’s kappa coefficient was calculated to assess inter-rater reliability. The reproducibility and correlation of the answers generated by ChatGPT and Gemini were analyzed. The mean scores of ChatGPT and Gemini (3.55 ± 0.83 vs. 3.57 ± 0.89, p = 0.260) and their completely correct response rates (71.0% vs. 78.4%, p = 0.111) were similar. In subgroup analyses, the completely correct answer rates were also similar for the CDC questions section (90.1% vs. 91.9%, p = 0.752), the guideline questions section (49.4% vs. 61.4%, p = 0.140), and the social media platform questions section (82.5% vs. 90%, p = 0.335). There was a moderate positive correlation between the answers of the ChatGPT and Gemini chatbots (r = 0.633, p < 0.001). Reproducibility rates of answers were 91.3% for ChatGPT and 92% for Gemini (p = 0.710). According to Cohen’s kappa test, there was substantial inter-rater agreement for both ChatGPT (κ = 0.720) and Gemini (κ = 0.704). ChatGPT and Gemini successfully answered the CDC questions and the social media platform questions, but their correct answer rates were insufficient for the guideline questions.
format	Article
id	doaj-art-1364cc00e84640e59175ec8b0457a140
institution	Kabale University
issn	2045-2322
language	English
publishDate	2025-01-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-1364cc00e84640e59175ec8b0457a140; 2025-01-12T12:19:19Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; 15; 1; 1; 8; 10.1038/s41598-024-83575-1; Comparison of the performances between ChatGPT and Gemini in answering questions on viral hepatitis; Meryem Sahin Ozdemir [0]; Yusuf Emre Ozdemir [1]; Department of Infectious Diseases and Clinical Microbiology, Basaksehir Cam and Sakura City Hospital; Department of Infectious Diseases and Clinical Microbiology, Bakirkoy Dr Sadi Konuk Training and Research Hospital; https://doi.org/10.1038/s41598-024-83575-1; ChatGPT; Gemini; Viral hepatitis; Hepatitis B; Hepatitis C
title	Comparison of the performances between ChatGPT and Gemini in answering questions on viral hepatitis
topic	ChatGPT; Gemini; Viral hepatitis; Hepatitis B; Hepatitis C
url	https://doi.org/10.1038/s41598-024-83575-1


Similar Items