The performance of GPT-3.5 and GPT-4 on genetic tests at PhD-level: GPT-4 as a promising tool for genomic medicine and education

Bibliographic Details
Main Authors: Teymoor Khosravi, Arian Rahimzadeh, Farzaneh Motallebi, Fatemeh Vaghefi, Zainab Mohammad Al Sudani, Morteza Oladnabi
Format: Article
Language: English
Published: Golestan University Of Medical Sciences 2024-12-01
Series: Journal of Clinical and Basic Research
Subjects: natural language processing; generative artificial intelligence; genetics
Online Access: http://jcbr.goums.ac.ir/article-1-476-en.pdf
collection DOAJ
description Background: Natural Language Processing (NLP) has empowered AI models to understand and generate human language, with transformer-based architectures like GPT-3 and GPT-4 marking significant advancements. GPT-4, equipped with a larger parameter count and multimodal capabilities, offers enhanced accuracy and contextual understanding over its predecessor, GPT-3.5. However, challenges such as factual inaccuracies remain. This study aims to evaluate GPT-4's performance on genetics-related tasks, assessing its strengths and limitations compared to GPT-3.5.
Methods: We assessed GPT-4's performance across five key genetic tasks: (1) understanding basic genetic concepts, (2) interpreting family pedigrees, (3) analyzing genetic mutations, (4) solving population genetics problems, and (5) answering medical genetics Ph.D. entrance exam questions. Both open-ended and multiple-choice questions (MCQs) were used, some of which required forced justification to evaluate reasoning. GPT-4's multimodal capabilities were also tested using pedigree images for inheritance pattern analysis.
Results: GPT-4 demonstrated perfect accuracy in Task 1 (basic genetic concepts) and Task 3 (genetic mutation interpretation), correctly answering all 10 and 16 questions, respectively. In Task 2 (pedigree analysis), GPT-4 answered 24 out of 71 questions correctly, with 47 incorrect responses. For Task 4 (population genetics problems), GPT-4 provided 30 correct answers out of 34. In Task 5, which assessed performance on a Ph.D. entrance exam, GPT-4 correctly answered 58 out of 80 questions. Performance was notably higher for MCQs than for open-ended questions.
Conclusion: GPT-4 substantially improves over GPT-3.5, particularly in understanding genetic concepts and interpreting genetic mutations. Despite these advances, its performance in more complex tasks, such as pedigree analysis, reveals areas that require further refinement. These findings highlight GPT-4's potential in advancing genetic education and research. Future studies should further explore GPT-4's capabilities and address its limitations in tasks that demand higher reasoning and factual accuracy.
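The per-task counts in the Results can be turned into accuracy percentages for easier comparison. The following is a minimal sketch, not part of the original record: the dictionary name `RESULTS`, the helper names, and the pooled figure across all five tasks are assumptions introduced here for illustration, and only the correct/total counts come from the abstract.

```python
# Counts (correct, total) for GPT-4, copied from the abstract's Results.
# The names RESULTS, accuracy, per_task and pooled are illustrative only.
RESULTS = {
    "Task 1: basic genetic concepts": (10, 10),
    "Task 2: pedigree analysis": (24, 71),
    "Task 3: genetic mutation interpretation": (16, 16),
    "Task 4: population genetics problems": (30, 34),
    "Task 5: Ph.D. entrance exam": (58, 80),
}


def accuracy(correct: int, total: int) -> float:
    """Return accuracy as a percentage."""
    return 100.0 * correct / total


def per_task(results: dict) -> dict:
    """Map each task name to its accuracy, rounded to one decimal place."""
    return {task: round(accuracy(c, t), 1) for task, (c, t) in results.items()}


def pooled(results: dict) -> float:
    """Accuracy over all questions pooled together.

    This pooled value is derived here and is not reported in the abstract.
    """
    correct = sum(c for c, _ in results.values())
    total = sum(t for _, t in results.values())
    return round(accuracy(correct, total), 1)


if __name__ == "__main__":
    for task, pct in per_task(RESULTS).items():
        print(f"{task}: {pct}%")
    print(f"Pooled over all tasks: {pooled(RESULTS)}%")
```

Read this way, pedigree analysis (24 of 71, roughly one third correct) stands out as the weakest task, which matches the abstract's conclusion.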
id doaj-art-02b4e0a4a1324a3ba1eccaf0502c8b18
institution Kabale University
issn 2538-3736
spelling doaj-art-02b4e0a4a1324a3ba1eccaf0502c8b18 2025-08-20T03:53:23Z eng Golestan University Of Medical Sciences Journal of Clinical and Basic Research 2538-3736 2024-12-01 842226
Teymoor Khosravi: Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran
Arian Rahimzadeh: Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran
Farzaneh Motallebi: Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran
Fatemeh Vaghefi: Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran
Zainab Mohammad Al Sudani: Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran
Morteza Oladnabi: Gorgan Congenital Malformations Research Center, Golestan University of Medical Sciences, Gorgan, Iran; Department of Medical Genetics, School of Advanced Technologies in Medicine, Golestan University of Medical Sciences, Gorgan, Iran; Ischemic Disorders Research Center, Golestan University of Medical Sciences, Gorgan, Iran
topic natural language processing
generative artificial intelligence
genetics