The wisdom of the lexicon crowds: leveraging on decades of lexicon-based sentiment analysis for improved results

Bibliographic Details
Main Authors: Chelsey H. Hill, Jorge E. Fresneda, Murugan Anandarajan
Format: Article
Language: English
Published: SpringerOpen 2025-05-01
Series: Journal of Big Data
Online Access: https://doi.org/10.1186/s40537-025-01186-7
Description
Summary: The “wisdom of the crowd” (WoC) refers to the notion that collective human knowledge can outperform even individual expert knowledge. This study investigates the application of this phenomenon to lexicon-based sentiment analysis of text data. Lexicons are frequently used to classify the sentiment of text data, particularly in the absence of sentiment class label information. We propose leveraging some of the most popular, publicly available lexicons created in the last half century to improve sentiment analysis performance. Specifically, this research argues that the collective information provided by the thirteen lexicons included in the crowd constitutes a WoC situation that can predict sentiment more accurately in the majority of example cases than individual lexicons, lexicon ensembles, and machine learning methods. Thirteen popular sentiment-labeled text datasets, comprising different types of text data and covering a variety of domains, are used to test this research proposition. We show that WoC sentiment analysis outperforms both individual lexicons, which are considered to be ‘experts’, and a lexicon ensemble approach. Compared against popular machine learning approaches, the proposed WoC method achieves superior results in the majority of examples. By overcoming many of the limitations of other approaches with high accuracy, the WoC method can provide organizations with real-time, reliable, and accurate sentiment analysis.
ISSN: 2196-1115
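
The summary does not state how the lexicon crowd's outputs are combined, so the following is only a minimal sketch of one plausible reading: each lexicon labels a text independently, and the crowd label is a plurality vote. The three toy lexicons, the function names `lexicon_label` and `crowd_label`, the whitespace tokenizer, and the tie-breaking rule are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Hypothetical toy lexicons (word -> polarity score). The study uses
# thirteen published lexicons; these three are invented for illustration.
TOY_LEXICONS = [
    {"good": 1, "great": 1, "bad": -1},
    {"good": 1, "terrible": -1, "bad": -1},
    {"great": 1, "slow": -1},
]


def lexicon_label(text, lexicon):
    """Label a text with one lexicon by summing the scores of its words."""
    # Naive whitespace tokenization; punctuation is not stripped.
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def crowd_label(text, lexicons):
    """Combine the per-lexicon labels by plurality vote (assumed rule)."""
    votes = Counter(lexicon_label(text, lex) for lex in lexicons)
    ranked = votes.most_common()
    # Assumed tie-break: if the top two labels draw, fall back to neutral.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "neutral"
    return ranked[0][0]


print(crowd_label("good but slow", TOY_LEXICONS))
```

In this toy run, two lexicons score "good but slow" as positive and one scores it as negative, so the vote resolves to positive even though one "expert" disagrees.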