Evaluating Mathematical Problem-Solving Abilities of Generative AI Models: Performance Analysis of o1-preview and gpt-4o Using the Korean College Scholastic Ability Test
Main Author:
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10817549/
Summary: This study used questions from the Korean College Scholastic Ability Test to evaluate the mathematical problem-solving abilities of two recent generative AI models, o1-preview and gpt-4o. The models were tested on 92 questions from the mathematics sections of the 2023 and 2024 exams, and their performance was compared with that of human learners. The o1-preview model achieved an average accuracy of 81.52%, performing at a level comparable to top-tier human learners, while the gpt-4o model reached an average accuracy of 49.46%, placing it in the mid-to-lower tier. An analysis by problem type found that both models performed better on multiple-choice questions, and that accuracy for both declined as problem difficulty increased. The results further suggest that the models employ reasoning processes similar to those of human solvers. The study is significant in that it offers new insight into AI's mathematical abilities and the potential for using AI in education.
ISSN: 2169-3536