Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and GPT-4o Using the Korean College Scholastic Ability Test


Bibliographic Details
Main Author: Sejun Oh
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10817549/
Description
Summary: This study used questions from the Korean College Scholastic Ability Test to evaluate the mathematical problem-solving abilities of two recent generative AI models, o1-preview and GPT-4o. The models were evaluated on 92 questions from the mathematics sections of the 2023 and 2024 tests, and their performance was compared with that of human test-takers. The o1-preview model achieved an average accuracy of 81.52%, a level comparable to top-tier human learners, while the GPT-4o model showed mid- to lower-tier performance with an average accuracy of 49.46%. An analysis by problem type found that both models performed better on multiple-choice questions and that their accuracy declined as problem difficulty increased. The analysis also suggested that the models apply reasoning processes similar to those of human solvers. This study is significant because it offers new insights into AI's mathematical abilities and demonstrates the potential for using AI in education.
ISSN:2169-3536