The performance of GPT-3.5 and GPT-4 on genetic tests at PhD-level: GPT-4 as a promising tool for genomic medicine and education
Background: Natural Language Processing (NLP) has empowered AI models to understand and generate human language, with transformer-based architectures like GPT-3 and GPT-4 marking significant advancements. GPT-4, equipped with a larger parameter count and multimodal capabilities, offers enhanced accu...
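| Main Authors: | Teymoor Khosravi, Arian Rahimzadeh, Farzaneh Motallebi, Fatemeh Vaghefi, Zainab Mohammad Al Sudani, Morteza Oladnabi |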
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Golestan University Of Medical Sciences, 2024-12-01 |
| Series: | Journal of Clinical and Basic Research |
| Subjects: | natural language processing; generative artificial intelligence; genetics |
| Online Access: | http://jcbr.goums.ac.ir/article-1-476-en.pdf |
| _version_ | 1849311434555523072 |
|---|---|
| author | Teymoor Khosravi; Arian Rahimzadeh; Farzaneh Motallebi; Fatemeh Vaghefi; Zainab Mohammad Al Sudani; Morteza Oladnabi |
| author_facet | Teymoor Khosravi; Arian Rahimzadeh; Farzaneh Motallebi; Fatemeh Vaghefi; Zainab Mohammad Al Sudani; Morteza Oladnabi |
| author_sort | Teymoor Khosravi |
| collection | DOAJ |
| description | Background: Natural Language Processing (NLP) has empowered AI models to understand and generate human language, with transformer-based architectures like GPT-3 and GPT-4 marking significant advancements. GPT-4, equipped with a larger parameter count and multimodal capabilities, offers enhanced accuracy and contextual understanding over its predecessor, GPT-3.5. However, challenges such as factual inaccuracies remain. This study aims to evaluate GPT-4’s performance on genetics-related tasks, assessing its strengths and limitations compared to GPT-3.5.
Methods: We assessed GPT-4's performance across five key genetic tasks: (1) understanding basic genetic concepts, (2) interpreting family pedigrees, (3) analyzing genetic mutations, (4) solving population genetics problems, and (5) answering medical genetics Ph.D. entrance exam questions. Both open-ended and multiple-choice questions (MCQs) were used, some of which required forced justification to evaluate reasoning. GPT-4’s multimodal capabilities were also tested using pedigree images for inheritance pattern analysis.
Results: GPT-4 demonstrated perfect accuracy in Task 1 (basic genetic concepts) and Task 3 (genetic mutation interpretation), correctly answering all 10 and 16 questions, respectively. In Task 2 (pedigree analysis), GPT-4 answered 24 out of 71 questions correctly, with 47 incorrect responses. For Task 4 (population genetics problems), GPT-4 provided 30 correct answers out of 34. In Task 5, which assessed performance on a Ph.D. entrance exam, GPT-4 correctly answered 58 out of 80 questions. Performance was notably higher for MCQs than for open-ended questions.
Conclusion: GPT-4 substantially improves over GPT-3.5, particularly in understanding genetic concepts and interpreting genetic mutations. Despite these advances, its performance in more complex tasks, such as pedigree analysis, reveals areas that require further refinement. These findings highlight GPT-4's potential in advancing genetic education and research. Future studies should further explore GPT-4's capabilities and address its limitations in tasks that demand higher reasoning and factual accuracy. |
| format | Article |
| id | doaj-art-02b4e0a4a1324a3ba1eccaf0502c8b18 |
| institution | Kabale University |
| issn | 2538-3736 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Golestan University Of Medical Sciences |
| record_format | Article |
| series | Journal of Clinical and Basic Research |
| spelling | doaj-art-02b4e0a4a1324a3ba1eccaf0502c8b18; 2025-08-20T03:53:23Z; eng; Golestan University Of Medical Sciences; Journal of Clinical and Basic Research; 2538-3736; 2024-12-01; 842226; The performance of GPT-3.5 and GPT-4 on genetic tests at PhD-level: GPT-4 as a promising tool for genomic medicine and education; Teymoor Khosravi (0), Arian Rahimzadeh (1), Farzaneh Motallebi (2), Fatemeh Vaghefi (3), Zainab Mohammad Al Sudani (4), Morteza Oladnabi (5); Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran; Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran; Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran; Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran; Student Research Committee, Golestan University of Medical Sciences, Gorgan, Iran; Gorgan Congenital Malformations Research Center, Golestan University of Medical Sciences, Gorgan, Iran, Department of Medical Genetics, School of Advanced Technologies in Medicine, Golestan University of Medical Sciences, Gorgan, Iran, Ischemic Disorders Research Center, Golestan University of Medical Sciences, Gorgan, Iran; http://jcbr.goums.ac.ir/article-1-476-en.pdf; natural language processing; generative artificial intelligence; genetics |
| title | The performance of GPT-3.5 and GPT-4 on genetic tests at PhD-level: GPT-4 as a promising tool for genomic medicine and education |
| title_sort | performance of gpt 3 5 and gpt 4 on genetic tests at phd level gpt 4 as a promising tool for genomic medicine and education |
| topic | natural language processing; generative artificial intelligence; genetics |
| url | http://jcbr.goums.ac.ir/article-1-476-en.pdf |
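
The Results paragraph of the description reports raw counts per task rather than percentages. A minimal sketch of the corresponding arithmetic follows, assuming the counts are read exactly as stated in the abstract. The names `RESULTS`, `accuracy`, and `pooled_accuracy` are illustrative and not part of the record, and the pooled figure is derived here; the abstract does not report it.

```python
# Correct and total counts per task, copied from the Results paragraph
# of the abstract. The dictionary name and layout are illustrative.
RESULTS = {
    "Task 1 (basic genetic concepts)": (10, 10),
    "Task 2 (pedigree analysis)": (24, 71),
    "Task 3 (genetic mutation interpretation)": (16, 16),
    "Task 4 (population genetics problems)": (30, 34),
    "Task 5 (Ph.D. entrance exam)": (58, 80),
}


def accuracy(correct: int, total: int) -> float:
    """Return accuracy as a percentage of correct answers."""
    return 100.0 * correct / total


def pooled_accuracy(results: dict) -> float:
    """Accuracy over all questions pooled together (a derived figure,
    not one stated in the abstract)."""
    correct = sum(c for c, _ in results.values())
    total = sum(t for _, t in results.values())
    return accuracy(correct, total)


if __name__ == "__main__":
    for task, (c, t) in RESULTS.items():
        print(f"{task}: {c}/{t} = {accuracy(c, t):.1f}%")
    print(f"Pooled: {pooled_accuracy(RESULTS):.1f}%")
```

Read this way, the pedigree task (24 of 71) is the clear outlier, which matches the Conclusion's remark that pedigree analysis needs further refinement.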