Fine-Tuning Retrieval-Augmented Generation with an Auto-Regressive Language Model for Sentiment Analysis in Financial Reviews

Sentiment analysis is a well-known task that has been used to analyse customer feedback reviews and media headlines to detect the sentimental personality or polarisation of a given text. With the growth of social media and other online platforms, like Twitter (now branded as X), Facebook, blogs, and...

Bibliographic Details
Main Authors:	Miehleketo Mathebula, Abiodun Modupe, Vukosi Marivate
Format:	Article
Language:	English
Published:	MDPI AG 2024-11-01
Series:	Applied Sciences
Subjects:	large language models; sentiment analysis; retrieval-augmented generation; prompt engineering; conversational fine-tuning; retrieval augmented generation assessment
Online Access:	https://www.mdpi.com/2076-3417/14/23/10782

author	Miehleketo Mathebula; Abiodun Modupe; Vukosi Marivate
author_facet	Miehleketo Mathebula; Abiodun Modupe; Vukosi Marivate
author_sort	Miehleketo Mathebula
collection	DOAJ
description	Sentiment analysis is a well-known task that has been used to analyse customer feedback reviews and media headlines to detect the sentimental personality or polarisation of a given text. With the growth of social media and other online platforms, like Twitter (now branded as X), Facebook, blogs, and others, it has been used in the investment community to monitor customer feedback, reviews, and news headlines about financial institutions’ products and services to ensure business success and prioritise aspects of customer relationship management. Supervised learning algorithms have been popularly employed for this task, but the performance of these models has been compromised due to the brevity of the content and the presence of idiomatic expressions, sound imitations, and abbreviations. Additionally, the pre-training of a larger language model (PTLM) struggles to capture bidirectional contextual knowledge learnt through word dependency because the sentence-level representation fails to take broad features into account. We develop a novel structure called language feature extraction and adaptation for reviews (LFEAR), an advanced natural language model that amalgamates retrieval-augmented generation (RAG) with a conversation format for an auto-regressive fine-tuning model (ARFT). This helps to overcome the limitations of lexicon-based tools and the reliance on pre-defined sentiment lexicons, which may not fully capture the range of sentiments in natural language and address questions on various topics and tasks. LFEAR is fine-tuned on Hellopeter reviews that incorporate industry-specific contextual information retrieval to show resilience and flexibility for various tasks, including analysing sentiments in reviews of restaurants, movies, politics, and financial products. 
The proposed model achieved an average precision score of 98.45%, answer correctness of 93.85%, and context precision of 97.69% based on Retrieval-Augmented Generation Assessment (RAGAS) metrics. The LFEAR model is effective in conducting sentiment analysis across various domains due to its adaptability and scalable inference mechanism. It considers unique language characteristics and patterns in specific domains to ensure accurate sentiment annotation. This is particularly beneficial for individuals in the financial sector, such as investors and institutions, including those listed on the Johannesburg Stock Exchange (JSE), which is the primary stock exchange in South Africa and plays a significant role in the country’s financial market. Future initiatives will focus on incorporating a wider range of data sources and improving the system’s ability to express nuanced sentiments effectively, enhancing its usefulness in diverse real-world scenarios.
format	Article
id	doaj-art-32b3eb95cdd6481bb5a8a4c3ba97f71a
institution	Kabale University
issn	2076-3417
language	English
publishDate	2024-11-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj-art-32b3eb95cdd6481bb5a8a4c3ba97f71a; 2024-12-13T16:21:43Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2024-11-01; vol. 14, no. 23, article 10782; DOI 10.3390/app142310782; Fine-Tuning Retrieval-Augmented Generation with an Auto-Regressive Language Model for Sentiment Analysis in Financial Reviews; Miehleketo Mathebula, Abiodun Modupe, Vukosi Marivate (all: Department of Computer Science, University of Pretoria, Lynnwood Road, Pretoria 0002, South Africa); https://www.mdpi.com/2076-3417/14/23/10782; large language models; sentiment analysis; retrieval-augmented generation; prompt engineering; conversational fine-tuning; retrieval augmented generation assessment
title	Fine-Tuning Retrieval-Augmented Generation with an Auto-Regressive Language Model for Sentiment Analysis in Financial Reviews
topic	large language models; sentiment analysis; retrieval-augmented generation; prompt engineering; conversational fine-tuning; retrieval augmented generation assessment
url	https://www.mdpi.com/2076-3417/14/23/10782

Similar Items