A framework for evaluating cultural bias and historical misconceptions in LLMs outputs
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | KeAi Communications Co. Ltd., 2025-09-01 |
| Series: | BenchCouncil Transactions on Benchmarks, Standards and Evaluations |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2772485925000481 |
| Summary: | Large Language Models (LLMs), while powerful, often perpetuate cultural biases and historical inaccuracies from their training data, marginalizing underrepresented perspectives. To address these issues, we introduce a structured framework to systematically evaluate and quantify these deficiencies. Our methodology combines culturally sensitive prompting with two novel metrics: the Cultural Bias Score (CBS) and the Historical Misconception Score (HMS). Our analysis reveals varying cultural biases across LLMs, with certain Western-centric models, such as Gemini, exhibiting higher bias. In contrast, other models, including ChatGPT and Poe, demonstrate more balanced cultural narratives. We also find that historical misconceptions are most prevalent for less-documented events, underscoring the critical need for training data diversification. Our framework suggests the potential effectiveness of bias-mitigation techniques, including dataset augmentation and human-in-the-loop (HITL) verification. Empirical validation of these strategies remains an important direction for future work. This work provides a replicable and scalable methodology for developers and researchers to help ensure the responsible and equitable deployment of LLMs in critical domains such as education and content moderation. |
| ISSN: | 2772-4859 |
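
The abstract describes the evaluation pipeline only at a high level, and the CBS and HMS formulas are not given in this record. The sketch below shows one plausible wiring of such a harness, assuming each metric is simply the mean of per-prompt annotator ratings on a 0-to-1 scale; every identifier here (`Prompt`, `evaluate_model`, the stub model and rater) is a hypothetical illustration, not the authors' published code.

```python
"""Illustrative CBS/HMS-style evaluation harness.

ASSUMPTION: both metrics are normalized averages of per-prompt
annotator ratings (0.0 = unbiased / factually sound, 1.0 = strongly
biased / clear misconception). The paper's actual definitions are
behind the Online Access link above.
"""
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class Prompt:
    text: str      # culturally sensitive question posed to the model
    category: str  # "culture" counts toward CBS, "history" toward HMS


# A rater maps (prompt_text, model_response) to a rating in [0, 1].
Rater = Callable[[str, str], float]


def evaluate_model(
    model: Callable[[str], str],  # any text-in/text-out LLM wrapper
    prompts: list[Prompt],
    rate: Rater,
) -> dict[str, float]:
    """Return assumed CBS/HMS as mean ratings per prompt category."""
    scores: dict[str, list[float]] = {"culture": [], "history": []}
    for p in prompts:
        response = model(p.text)
        scores[p.category].append(rate(p.text, response))
    return {
        "CBS": mean(scores["culture"]) if scores["culture"] else 0.0,
        "HMS": mean(scores["history"]) if scores["history"] else 0.0,
    }


if __name__ == "__main__":
    demo_prompts = [
        Prompt("Describe a typical family meal.", "culture"),
        Prompt("Summarize the causes of the 1904 Herero uprising.", "history"),
    ]

    def stub_model(text: str) -> str:
        return "stub response"  # stands in for a real LLM endpoint

    def stub_rater(prompt: str, response: str) -> float:
        return 0.5  # placeholder rating; a human annotator in practice

    print(evaluate_model(stub_model, demo_prompts, stub_rater))
```

In an actual study, `stub_model` would wrap a real LLM API client and `stub_rater` would be replaced by the human-in-the-loop (HITL) annotation step the abstract mentions; the culturally sensitive prompt set and rating rubric are where the framework's substance lies.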