Enhancing Persian text summarization through a three-phase fine-tuning and reinforcement learning approach with the mT5 transformer model
Abstract: In the contemporary era, grappling with the vast expanse of big data presents a formidable obstacle, particularly when it comes to extracting vital information from extensive textual sources. The constant influx of news articles from various agencies necessitates an enormous amount of time to digest comprehensively. A viable solution to address this challenge lies in the realm of automatic text summarization, which is a pivotal and intricate endeavor within the field of natural language processing. Text summarization involves transforming pertinent textual content into a concise format that reduces its word count without compromising its underlying meaning. In recent years, transformers have emerged as a prominent force in the landscape of natural language processing, particularly in the realm of text summarization. This research endeavors to harness the power of transformers by training the mT5-base model on a three-step fine-tuning phase on Persian news articles. Subsequently, reinforcement learning via the PPO algorithm is integrated with the fine-tuned model. Finally, we evaluate the model’s performance in summarizing Persian texts, shedding light on its efficacy in addressing the formidable task of distilling meaningful insights from a sea of textual data. Our model has set a new benchmark in the field of Persian text summarization, achieving outstanding ROUGE scores of 53.17 for ROUGE-1, 37.12 for ROUGE-2, and 44.13 for ROUGE-L. These remarkable results reflect a significant advancement in the quality of Persian text summarization, signaling a promising era of more refined and context-aware summaries.
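The abstract reports ROUGE-1, ROUGE-2, and ROUGE-L scores as its headline results. As a rough illustration of what those figures measure, here is a minimal, dependency-free sketch of ROUGE-N F1 (clipped n-gram overlap between a candidate summary and a reference). This is not the paper's evaluation pipeline — the authors presumably used a standard ROUGE implementation — and the tokens below are placeholder English words standing in for Persian text.

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 between two pre-tokenized summaries.

    Overlapping n-grams are counted with clipping (via Counter
    intersection), as in the standard ROUGE definition.
    """
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # clipped match count
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Placeholder example: 5 of 6 unigrams overlap, so P = R = F1 = 5/6.
ref = "the model summarizes persian news articles".split()
cand = "the model summarizes news articles well".split()
score = rouge_n(cand, ref, n=1)
print(round(score, 3))  # → 0.833
```

ROUGE-L, also reported in the abstract, differs in that it scores the longest common subsequence rather than fixed-length n-gram overlap.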
Main Authors: Vahid Nejad Mahmood Abadi; Fahimeh Ghasemian
Format: Article
Language: English
Published: Nature Portfolio, 2025-01-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-78235-3
_version_ | 1841559637931851776
author | Vahid Nejad Mahmood Abadi; Fahimeh Ghasemian
author_facet | Vahid Nejad Mahmood Abadi; Fahimeh Ghasemian
author_sort | Vahid Nejad Mahmood Abadi
collection | DOAJ
format | Article
id | doaj-art-129d5513a0f94f2d8aacf93f79645376
institution | Kabale University
issn | 2045-2322
language | English
publishDate | 2025-01-01
publisher | Nature Portfolio
record_format | Article
series | Scientific Reports
author_affiliation | Department of Computer Engineering, Faculty of Engineering, Shahid Bahonar University of Kerman (both authors)
doi | 10.1038/s41598-024-78235-3