Multifaceted Assessment of Responsible Use and Bias in Language Models for Education

Large language models (LLMs) are increasingly being utilized to develop tools and services in various domains, including education. However, due to the nature of the training data, these models are susceptible to inherent social or cognitive biases, which can influence their outputs. Furthermore, th...

Full description

Saved in:
Bibliographic Details
Main Authors: Ishrat Ahmed, Wenxing Liu, Rod D. Roscoe, Elizabeth Reilley, Danielle S. McNamara
Format: Article
Language:English
Published: MDPI AG 2025-03-01
Series:Computers
Subjects:
Online Access:https://www.mdpi.com/2073-431X/14/3/100
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Large language models (LLMs) are increasingly being utilized to develop tools and services in various domains, including education. However, due to the nature of the training data, these models are susceptible to inherent social or cognitive biases, which can influence their outputs. Furthermore, their handling of critical topics, such as privacy and sensitive questions, is essential for responsible deployment. This study proposes a framework for the automatic detection of biases and violations of responsible use using a synthetic question-based dataset mimicking student–chatbot interactions. We employ the LLM-as-a-judge method to evaluate multiple LLMs for biased responses. Our findings show that some models exhibit more bias than others, highlighting the need for careful consideration when selecting models for deployment in educational and other high-stakes applications. These results emphasize the importance of addressing bias in LLMs and implementing robust mechanisms to uphold responsible AI use in real-world services.
ISSN:2073-431X