Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test

This study used questions from the Korean College Scholastic Ability Test to evaluate the mathematical problem-solving abilities of the latest generative AI models, o1-preview and GPT-4o. The models' performance was analyzed on 92 questions from the mathematics sections of the 2023 and 2024 tests and compared with that of human learners. The results showed that the o1-preview model achieved an average accuracy rate of 81.52%, performing at a level comparable to top-tier human learners, while the GPT-4o model showed mid- to lower-tier performance with an average accuracy rate of 49.46%. An analysis by problem type found that both models performed better on multiple-choice questions, but their accuracy decreased as problem difficulty increased. The findings also suggest that AI uses reasoning processes similar to those of humans when solving mathematical problems. This study is significant because it offers new insights into AI's mathematical abilities and shows the potential for using AI in education.


Bibliographic Details
Main Author: Sejun Oh
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects: Generative AI; mathematical problem-solving; o1-preview; college scholastic ability test; AI educational applications
Online Access: https://ieeexplore.ieee.org/document/10817549/
_version_ 1841563304929001472
author Sejun Oh
author_facet Sejun Oh
author_sort Sejun Oh
collection DOAJ
description This study used questions from the Korean College Scholastic Ability Test to evaluate the mathematical problem-solving abilities of the latest generative AI models, o1-preview and GPT-4o. The models' performance was analyzed on 92 questions from the mathematics sections of the 2023 and 2024 tests and compared with that of human learners. The results showed that the o1-preview model achieved an average accuracy rate of 81.52%, performing at a level comparable to top-tier human learners, while the GPT-4o model showed mid- to lower-tier performance with an average accuracy rate of 49.46%. An analysis by problem type found that both models performed better on multiple-choice questions, but their accuracy decreased as problem difficulty increased. The findings also suggest that AI uses reasoning processes similar to those of humans when solving mathematical problems. This study is significant because it offers new insights into AI's mathematical abilities and shows the potential for using AI in education.
format Article
id doaj-art-8c72864923b644de9204502b319a946f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-8c72864923b644de9204502b319a946f
2025-01-03T00:01:40Z | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2025-01-01 | Vol. 13, pp. 1227-1235 | DOI 10.1109/ACCESS.2024.3523703 | IEEE article no. 10817549
Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
Sejun Oh (https://orcid.org/0000-0001-9398-2899), Department of Mathematics Education, Hongik University, Seoul, South Korea
This study used questions from the Korean College Scholastic Ability Test to evaluate the mathematical problem-solving abilities of the latest generative AI models, o1-preview and GPT-4o. The models' performance was analyzed on 92 questions from the mathematics sections of the 2023 and 2024 tests and compared with that of human learners. The results showed that the o1-preview model achieved an average accuracy rate of 81.52%, performing at a level comparable to top-tier human learners, while the GPT-4o model showed mid- to lower-tier performance with an average accuracy rate of 49.46%. An analysis by problem type found that both models performed better on multiple-choice questions, but their accuracy decreased as problem difficulty increased. The findings also suggest that AI uses reasoning processes similar to those of humans when solving mathematical problems. This study is significant because it offers new insights into AI's mathematical abilities and shows the potential for using AI in education.
https://ieeexplore.ieee.org/document/10817549/
Generative AI | mathematical problem-solving | o1-preview | college scholastic ability test | AI educational applications
spellingShingle Sejun Oh
Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
IEEE Access
Generative AI
mathematical problem-solving
o1-preview
college scholastic ability test
AI educational applications
title Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_full Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_fullStr Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_full_unstemmed Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_short Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test
title_sort evaluating mathematical problem solving abilities of generative ai models performance analysis of o1 preview and gpt 4o using the korean college scholastic ability test
topic Generative AI
mathematical problem-solving
o1-preview
college scholastic ability test
AI educational applications
url https://ieeexplore.ieee.org/document/10817549/
work_keys_str_mv AT sejunoh evaluatingmathematicalproblemsolvingabilitiesofgenerativeaimodelsperformanceanalysisofo1previewandgpt4ousingthekoreancollegescholasticabilitytest