Evaluating ChatGPT-3’s efficacy in solving coding tasks: implications for academic integrity in English language assessments

Bibliographic Details
Main Author: Seyedeh Elham Elhambakhsh
Format: Article
Language: English
Published: SpringerOpen 2025-07-01
Series: Language Testing in Asia
Online Access: https://doi.org/10.1186/s40468-024-00333-w
Description
Summary: This study examined ChatGPT-3’s ability to generate code solutions for assessment problems that are commonly graded by automatic correction tools in the TEFL academic setting, focusing on the Kattis platform. The researcher explored the implications for academic integrity and the challenges posed by AI-generated solutions. ChatGPT was tested on a subset of 124 English language assessment tasks from Kattis, a widely used automatic software grading tool, and independently solved 16 of them (about 13%). It performed well on simpler problems but struggled with more complex assessment tasks. To supplement the quantitative findings, a qualitative follow-up was conducted that included interviews with two EFL assessment instructors. The discussion covered methodological considerations, the effectiveness of Kattis in preventing cheating, and the limitations in detecting AI-generated code; the qualitative insights revealed both strengths and limitations of Kattis in preventing cheating. The study emphasizes the need for continuous adaptation of EFL assessment methodologies to maintain academic integrity as AI capabilities evolve. As students gain access to sophisticated AI-generated solutions, vigilant strategies to uphold originality and critical thinking in academic work become increasingly crucial. The findings have implications for multiple stakeholders: (1) awareness of AI capabilities in generating code solutions, which necessitates vigilant assessment strategies; (2) an understanding of the importance of academic integrity and of the limitations of AI in mastering complex assessment tasks; and (3) insights into the interplay between AI, automated assessment systems, and academic integrity, which can guide future investigations. These results illustrate the need for careful assessment design to mitigate the risk of AI-assisted academic dishonesty while maintaining rigorous academic standards.
ISSN:2229-0443