RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation
Automatic generation of chest X-ray reports, designed to produce clinically precise descriptions from chest X-ray images, is gaining significant research attention because of its vast potential in clinical applications. Recently, despite considerable progress, current models typically adhere to a CN...
| Main Authors: | Caijie Qin, Yize Xiong, Weibin Chen, Yong Li |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Mathematics |
| Subjects: | chest X-ray report generation; transformer; image-to-text; reinforcement learning |
| Online Access: | https://www.mdpi.com/2227-7390/13/9/1492 |
| _version_ | 1849322445216940032 |
|---|---|
| author | Caijie Qin, Yize Xiong, Weibin Chen, Yong Li |
| collection | DOAJ |
| description | Automatic chest X-ray report generation, which aims to produce clinically precise descriptions from chest X-ray images, is attracting significant research attention because of its potential in clinical applications. Despite considerable recent progress, current models typically follow a CNN–Transformer framework, which still has a limited perceptual field during image feature extraction. To address this problem, we propose the Reinforced Memory-driven Pure Transformer (RMPT), a novel Transformer–Transformer model. RMPT employs the Swin Transformer to extract visual features from the input X-ray images; its larger perceptual field better models the relationships between different regions. Furthermore, we adopt a memory-driven Transformer (MemTrans) to model similar patterns across different reports, which helps the model generate long reports. Finally, we present a training approach based on Reinforcement Learning (RL) that steers the model to focus on challenging samples, improving its overall performance on both straightforward and complex cases. Experimental results on the IU X-ray dataset show that RMPT achieves superior performance on various Natural Language Generation (NLG) evaluation metrics. Ablation results further show that RMPT improves overall performance by 10.5% compared to the base model. |
| format | Article |
| id | doaj-art-4f08618ec5c9411fb6ff738e42e9b29d |
| institution | Kabale University |
| issn | 2227-7390 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Mathematics |
| spelling | doaj-art-4f08618ec5c9411fb6ff738e42e9b29d; 2025-08-20T03:49:22Z; eng; MDPI AG; Mathematics; ISSN 2227-7390; 2025-04-01; vol. 13, no. 9, art. 1492; DOI 10.3390/math13091492; RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation; Caijie Qin (Institute of Information Engineering, Sanming University, Sanming 365004, China); Yize Xiong (Institute of Information Engineering, Sanming University, Sanming 365004, China); Weibin Chen (Qingdao Nuocheng Chemicals Safty Technology Co., Ltd., Qingdao 266071, China); Yong Li (Institute of Information Engineering, Sanming University, Sanming 365004, China); https://www.mdpi.com/2227-7390/13/9/1492; chest X-ray report generation; transformer; image-to-text; reinforcement learning |
| title | RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation |
| topic | chest X-ray report generation; transformer; image-to-text; reinforcement learning |
| url | https://www.mdpi.com/2227-7390/13/9/1492 |