BdSentiLLM: A Novel LLM Approach to Sentiment Analysis of Product Reviews

Bibliographic Details
Main Authors: Atia Shahnaz Ipa, Priyo Nath Roy, Mohammad Abu Tareq Rony, Ali Raza, Norma Latif Fitriyani, Yeonghyeon Gu, Muhammad Syafrudin
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10798428/
Description
Summary: Online communication has led to more people expressing themselves in their preferred languages, especially in e-commerce, where product reviews are crucial. Understanding customer sentiment through product reviews and comments can help businesses improve product quality and make informed decisions. However, the complexity of written language and the variety of languages used in reviews pose challenges for accurate sentiment analysis. In this study, we explored the linguistic landscape of Bangladeshi product reviews and developed BdSentiLLM, a robust model designed for automatic language classification and sentiment analysis in this context. We collected a dataset of 3,864 product reviews, revealing that 84% were written in English, followed by Bangla, Banglish (Romanized Bangla), and Bangla-English code-switched content. BdSentiLLM can categorize and prepare these language types for sentiment analysis with large language models (LLMs). We evaluated the performance of four open-source LLMs, Llama-2, Flan-T5, Vicuna, and Falcon, using BdSentiLLM for sentiment analysis. BdSentiLLM with Llama-2 consistently outperformed the other models across most language categories, with F1 scores of 0.79 for Bangla, 0.70 for Banglish, 0.84 for Bangla_English, 0.90 for English, and 0.89 overall, while Flan-T5 excelled in English sentiment analysis. Compared to existing models, BdSentiLLM demonstrated superior versatility and effectiveness by handling mixed-language data across all categories, making it a valuable tool for sentiment analysis in diverse linguistic contexts. Future work will focus on expanding the dataset to enhance BdSentiLLM’s robustness and exploring its applicability beyond e-commerce to broader multilingual sentiment analysis tasks.
ISSN: 2169-3536