Enhancing Persian text summarization through a three-phase fine-tuning and reinforcement learning approach with the mT5 transformer model

Bibliographic Details
Main Authors: Vahid Nejad Mahmood Abadi, Fahimeh Ghasemian (both: Department of Computer Engineering, Faculty of Engineering, Shahid Bahonar University of Kerman)
Format: Article
Language: English
Published: Nature Portfolio, 2025-01-01
Series: Scientific Reports
ISSN: 2045-2322
Collection: DOAJ (Kabale University)
Online Access: https://doi.org/10.1038/s41598-024-78235-3

Abstract

Extracting vital information from ever-growing volumes of text is a formidable challenge; the constant influx of news articles from many agencies takes an enormous amount of time to digest comprehensively. Automatic text summarization, a pivotal and intricate task in natural language processing, offers a viable solution: it condenses pertinent textual content into a shorter form without compromising its underlying meaning. In recent years, transformers have become the dominant approach to this task. This research fine-tunes the mT5-base model on Persian news articles in a three-phase process and then integrates reinforcement learning, via the PPO algorithm, with the fine-tuned model. Finally, we evaluate the model's performance in summarizing Persian texts. The model sets a new benchmark for Persian text summarization, achieving ROUGE scores of 53.17 (ROUGE-1), 37.12 (ROUGE-2), and 44.13 (ROUGE-L): a significant advance in the quality and context-awareness of Persian summaries.
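
The three-phase fine-tuning described in the abstract maps onto standard sequence-to-sequence training. The sketch below, assuming the Hugging Face Transformers and Datasets libraries, shows what one such phase could look like; the data file "persian_news_train.jsonl", its "article"/"summary" columns, and every hyperparameter are illustrative placeholders rather than the authors' published configuration, and each subsequent phase would rerun the same loop on its own data and settings.

    # Minimal seq2seq fine-tuning sketch for mT5-base summarization.
    # Dataset file, column names, and hyperparameters are assumptions.
    from datasets import load_dataset
    from transformers import (
        AutoModelForSeq2SeqLM,
        AutoTokenizer,
        DataCollatorForSeq2Seq,
        Seq2SeqTrainer,
        Seq2SeqTrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")

    # Hypothetical Persian news corpus with "article" and "summary" fields.
    data = load_dataset("json", data_files={"train": "persian_news_train.jsonl"})

    def preprocess(batch):
        # T5-family models are conventionally prompted with a task prefix.
        enc = tokenizer(["summarize: " + a for a in batch["article"]],
                        max_length=512, truncation=True)
        labels = tokenizer(text_target=batch["summary"],
                           max_length=128, truncation=True)
        enc["labels"] = labels["input_ids"]
        return enc

    train = data["train"].map(preprocess, batched=True,
                              remove_columns=data["train"].column_names)

    args = Seq2SeqTrainingArguments(
        output_dir="mt5-persian-sum-phase1",  # one checkpoint per phase
        per_device_train_batch_size=4,
        learning_rate=3e-4,                   # common T5-family starting point
        num_train_epochs=3,
    )

    Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=train,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    ).train()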
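
For the reinforcement-learning stage, one plausible realization uses TRL's PPOTrainer with a value-head seq2seq model, rewarding each sampled summary by its ROUGE-L overlap with the reference. This is a rough sketch under that assumption, written against TRL's 0.x-era API (later releases changed it); the reward design, the checkpoint name "mt5-persian-sum-phase3", and the stand-in batch are illustrative, not the paper's actual setup, and the off-the-shelf "rouge" metric is English-oriented, so a Persian-aware tokenizer would be needed for a faithful reward.

    # PPO on top of the fine-tuned checkpoint, sketched with TRL's 0.x API.
    # The ROUGE-L reward and all names below are illustrative assumptions.
    import torch
    import evaluate
    from transformers import AutoTokenizer
    from trl import AutoModelForSeq2SeqLMWithValueHead, PPOConfig, PPOTrainer

    ckpt = "mt5-persian-sum-phase3"  # hypothetical checkpoint from the last phase
    tokenizer = AutoTokenizer.from_pretrained(ckpt)
    model = AutoModelForSeq2SeqLMWithValueHead.from_pretrained(ckpt)
    ref_model = AutoModelForSeq2SeqLMWithValueHead.from_pretrained(ckpt)  # frozen KL anchor

    rouge = evaluate.load("rouge")
    ppo = PPOTrainer(PPOConfig(batch_size=1, mini_batch_size=1),  # tiny sizes for the sketch
                     model, ref_model, tokenizer)

    batches = [{"article": ["..."], "summary": ["..."]}]  # stand-in for a real dataloader

    for batch in batches:
        queries = [tokenizer("summarize: " + a, return_tensors="pt").input_ids.squeeze(0)
                   for a in batch["article"]]
        responses = [ppo.generate(q, max_new_tokens=128).squeeze(0) for q in queries]
        preds = tokenizer.batch_decode(responses, skip_special_tokens=True)
        # One scalar reward per sample: ROUGE-L against the reference summary.
        rewards = [torch.tensor(rouge.compute(predictions=[p], references=[r])["rougeL"])
                   for p, r in zip(preds, batch["summary"])]
        ppo.step(queries, responses, rewards)  # PPO update toward higher-reward summaries

Keeping a frozen reference model is what constrains the policy through PPO's KL penalty, so the summarizer can chase a higher ROUGE reward without drifting into disfluent output.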