A framework for evaluating cultural bias and historical misconceptions in LLMs outputs
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | KeAi Communications Co. Ltd., 2025-09-01 |
| Series: | BenchCouncil Transactions on Benchmarks, Standards and Evaluations |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2772485925000481 |
| Summary: | Large Language Models (LLMs), while powerful, often perpetuate cultural biases and historical inaccuracies from their training data, marginalizing underrepresented perspectives. To address these issues, we introduce a structured framework to systematically evaluate and quantify these deficiencies. Our methodology combines culturally sensitive prompting with two novel metrics: the Cultural Bias Score (CBS) and the Historical Misconception Score (HMS). Our analysis reveals varying cultural biases across LLMs, with certain Western-centric models, such as Gemini, exhibiting higher bias. In contrast, other models, including ChatGPT and Poe, demonstrate more balanced cultural narratives. We also find that historical misconceptions are most prevalent for less-documented events, underscoring the critical need for training data diversification. Our framework suggests the potential effectiveness of bias-mitigation techniques, including dataset augmentation and human-in-the-loop (HITL) verification. Empirical validation of these strategies remains an important direction for future work. This work provides a replicable and scalable methodology for developers and researchers to help ensure the responsible and equitable deployment of LLMs in critical domains such as education and content moderation. |
| ISSN: | 2772-4859 |
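
The abstract describes the evaluation pipeline only at a high level, and the CBS and HMS formulas are not given in this record. The sketch below shows one plausible wiring of such a harness, assuming each metric is simply the mean of per-prompt annotator ratings on a 0-to-1 scale; every identifier here (`Prompt`, `evaluate_model`, the stub model and rater) is a hypothetical illustration, not the authors' published code.

```python
"""Illustrative CBS/HMS-style evaluation harness.

ASSUMPTION: both metrics are normalized averages of per-prompt
annotator ratings (0.0 = unbiased / factually sound, 1.0 = strongly
biased / clear misconception). The paper's actual definitions are
behind the Online Access link above.
"""
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class Prompt:
    text: str      # culturally sensitive question posed to the model
    category: str  # "culture" counts toward CBS, "history" toward HMS


# A rater maps (prompt_text, model_response) to a rating in [0, 1].
Rater = Callable[[str, str], float]


def evaluate_model(
    model: Callable[[str], str],  # any text-in/text-out LLM wrapper
    prompts: list[Prompt],
    rate: Rater,
) -> dict[str, float]:
    """Return assumed CBS/HMS as mean ratings per prompt category."""
    scores: dict[str, list[float]] = {"culture": [], "history": []}
    for p in prompts:
        response = model(p.text)
        scores[p.category].append(rate(p.text, response))
    return {
        "CBS": mean(scores["culture"]) if scores["culture"] else 0.0,
        "HMS": mean(scores["history"]) if scores["history"] else 0.0,
    }


if __name__ == "__main__":
    demo_prompts = [
        Prompt("Describe a typical family meal.", "culture"),
        Prompt("Summarize the causes of the 1904 Herero uprising.", "history"),
    ]

    def stub_model(text: str) -> str:
        return "stub response"  # stands in for a real LLM endpoint

    def stub_rater(prompt: str, response: str) -> float:
        return 0.5  # placeholder rating; a human annotator in practice

    print(evaluate_model(stub_model, demo_prompts, stub_rater))
```

In an actual study, `stub_model` would wrap a real LLM API client and `stub_rater` would be replaced by the human-in-the-loop (HITL) annotation step the abstract mentions; the culturally sensitive prompt set and rating rubric are where the framework's substance lies.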