Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test

This study used questions from the Korean College Scholastic Ability Test to evaluate the mathematical problem-solving abilities of the latest generative AI models, o1-preview and GPT-4o. The models' performance was analyzed on 92 questions from the mathematics sections of the 2023 and 2024 tests and compared with that of human learners. The results showed that the o1-preview model achieved an average accuracy rate of 81.52%, performing at a level comparable to top-tier human learners, while the GPT-4o model showed mid- to lower-tier performance with an average accuracy rate of 49.46%. An analysis by problem type found that both models performed better on multiple-choice questions, but their accuracy decreased as problem difficulty increased. The findings also suggest that AI uses reasoning processes similar to those of humans when solving mathematical problems. This study is significant because it offers new insights into AI's mathematical abilities and shows the potential for using AI in education.


Bibliographic Details
Main Author: Sejun Oh
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects: Generative AI; mathematical problem-solving; o1-preview; college scholastic ability test; AI educational applications
Online Access: https://ieeexplore.ieee.org/document/10817549/
_version_ 1841563304929001472
author Sejun Oh
author_facet Sejun Oh
author_sort Sejun Oh
collection DOAJ
description This study used questions from the Korean College Scholastic Ability Test to evaluate the mathematical problem-solving abilities of the latest generative AI models, o1-preview and GPT-4o. The models' performance was analyzed on 92 questions from the mathematics sections of the 2023 and 2024 tests and compared with that of human learners. The results showed that the o1-preview model achieved an average accuracy rate of 81.52%, performing at a level comparable to top-tier human learners, while the GPT-4o model showed mid- to lower-tier performance with an average accuracy rate of 49.46%. An analysis by problem type found that both models performed better on multiple-choice questions, but their accuracy decreased as problem difficulty increased. The findings also suggest that AI uses reasoning processes similar to those of humans when solving mathematical problems. This study is significant because it offers new insights into AI's mathematical abilities and shows the potential for using AI in education.
format Article
id doaj-art-8c72864923b644de9204502b319a946f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-8c72864923b644de9204502b319a946f
2025-01-03T00:01:40Z | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2025-01-01 | Vol. 13, pp. 1227-1235 | DOI 10.1109/ACCESS.2024.3523703 | IEEE article no. 10817549
Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
Sejun Oh (https://orcid.org/0000-0001-9398-2899), Department of Mathematics Education, Hongik University, Seoul, South Korea
This study used questions from the Korean College Scholastic Ability Test to evaluate the mathematical problem-solving abilities of the latest generative AI models, o1-preview and GPT-4o. The models' performance was analyzed on 92 questions from the mathematics sections of the 2023 and 2024 tests and compared with that of human learners. The results showed that the o1-preview model achieved an average accuracy rate of 81.52%, performing at a level comparable to top-tier human learners, while the GPT-4o model showed mid- to lower-tier performance with an average accuracy rate of 49.46%. An analysis by problem type found that both models performed better on multiple-choice questions, but their accuracy decreased as problem difficulty increased. The findings also suggest that AI uses reasoning processes similar to those of humans when solving mathematical problems. This study is significant because it offers new insights into AI's mathematical abilities and shows the potential for using AI in education.
https://ieeexplore.ieee.org/document/10817549/
Generative AI | mathematical problem-solving | o1-preview | college scholastic ability test | AI educational applications
spellingShingle Sejun Oh
Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
IEEE Access
Generative AI
mathematical problem-solving
o1-preview
college scholastic ability test
AI educational applications
title Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_full Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_fullStr Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_full_unstemmed Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_short Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_sort evaluating mathematical problem solving abilities of generative ai models performance analysis of o1 preview and gpt 4o using the korean college scholastic ability test
topic Generative AI
mathematical problem-solving
o1-preview
college scholastic ability test
AI educational applications
url https://ieeexplore.ieee.org/document/10817549/
work_keys_str_mv AT sejunoh evaluatingmathematicalproblemsolvingabilitiesofgenerativeaimodelsperformanceanalysisofo1previewandgpt4ousingthekoreancollegescholasticabilitytest