RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation

Automatic generation of chest X-ray reports, designed to produce clinically precise descriptions from chest X-ray images, is gaining significant research attention because of its vast potential in clinical applications. Recently, despite considerable progress, current models typically adhere to a CN...

Bibliographic Details
Main Authors: Caijie Qin, Yize Xiong, Weibin Chen, Yong Li
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Mathematics
Subjects: chest X-ray report generation; transformer; image-to-text; reinforcement learning
Online Access: https://www.mdpi.com/2227-7390/13/9/1492
_version_ 1849322445216940032
author Caijie Qin
Yize Xiong
Weibin Chen
Yong Li
collection DOAJ
description Automatic chest X-ray report generation, which aims to produce clinically precise descriptions from chest X-ray images, is attracting significant research attention because of its vast potential in clinical applications. Despite considerable recent progress, current models typically adhere to a CNN–Transformer-based framework, which still has a limited perceptual field during image feature extraction. To solve this problem, we propose the Reinforced Memory-driven Pure Transformer (RMPT), a novel Transformer–Transformer-based model. RMPT employs the Swin Transformer, which has a larger perceptual field, to extract visual features from the given X-ray images and to better model the relationships between different regions. Furthermore, we adopt a memory-driven Transformer (MemTrans) to model similar patterns across different reports, which helps the model generate long reports. Finally, we present a training approach based on Reinforcement Learning (RL) that steers the model to focus on challenging samples, improving its overall performance on both straightforward and complex cases. Experimental results on the IU X-ray dataset show that the proposed RMPT achieves superior performance on various Natural Language Generation (NLG) evaluation metrics. Further ablation results show that RMPT achieves a 10.5% overall performance improvement over the base model.
format Article
id doaj-art-4f08618ec5c9411fb6ff738e42e9b29d
institution Kabale University
issn 2227-7390
language English
publishDate 2025-04-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-4f08618ec5c9411fb6ff738e42e9b29d
2025-08-20T03:49:22Z
eng
MDPI AG
Mathematics
2227-7390
2025-04-01
13
9
1492
10.3390/math13091492
RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation
Caijie Qin
Yize Xiong
Weibin Chen
Yong Li
Institute of Information Engineering, Sanming University, Sanming 365004, China
Institute of Information Engineering, Sanming University, Sanming 365004, China
Qingdao Nuocheng Chemicals Safty Technology Co., Ltd., Qingdao 266071, China
Institute of Information Engineering, Sanming University, Sanming 365004, China
https://www.mdpi.com/2227-7390/13/9/1492
chest X-ray report generation
transformer
image-to-text
reinforcement learning
title RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation
topic chest X-ray report generation
transformer
image-to-text
reinforcement learning
url https://www.mdpi.com/2227-7390/13/9/1492