Exploration of biomarkers for predicting the prognosis of patients with diffuse large B-cell lymphoma by machine-learning analysis

Bibliographic Details
Main Authors: Shifen Wang, Hong Tao, Xingyun Zhao, Siwen Wu, Chunwei Yang, Yuanfei Shi, Zhenshu Xu, Dawei Cui
Format: Article
Language: English
Published: BMC 2025-08-01
Series: BMC Immunology
Online Access: https://doi.org/10.1186/s12865-025-00738-z
Description
Summary:

Background: Diffuse large B-cell lymphoma (DLBCL), a distinct hematological malignancy, is a major public health problem. However, the molecular mechanisms underlying the disease have not been clearly elucidated, so disease-specific diagnostic biomarkers and mechanisms urgently need to be explored.

Methods: Three microarray datasets (GSE25638, GSE12195 and GSE12453) were downloaded from the Gene Expression Omnibus (GEO) database. Key genes in DLBCL patients were screened by differentially expressed gene (DEG) analysis and weighted gene co-expression network analysis (WGCNA). Functional enrichment analysis and protein-protein interaction (PPI) network construction were used to reveal DLBCL-related pathogenic molecules and underlying mechanisms. Candidate biomarkers were screened using random forest (RF) analysis. A diagnostic nomogram was constructed and Kaplan-Meier (KM) survival analysis was performed to predict patient risk. Single-sample gene set enrichment analysis (ssGSEA) was used to explore immune cell infiltration in lymphoma. Expression of the hub genes was validated by quantitative real-time polymerase chain reaction (qRT-PCR) and immunohistochemistry (IHC).

Results: A total of 95 key genes were obtained from the three DLBCL datasets by DEG analysis and WGCNA. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses showed that the DEGs were significantly enriched in pathways associated with inflammatory response, biological process involved in interspecies interaction between organisms, C-X-C chemokine receptor binding and chemokine activity. Four hub genes (CXCL9, CCL18, C1QA and CTSC) were screened from the three datasets using RF algorithms, and they were closely correlated with the overall survival of DLBCL patients. Dysregulated infiltration of immune cells, including natural killer (NK) cells and T cells, was positively linked to the expression levels of the four hub genes. The nomogram model gave promising receiver operating characteristic (ROC) results. Additionally, increased expression of the four key genes was verified in DLBCL patients.

Conclusion: Four crucial hub genes (CXCL9, CCL18, C1QA and CTSC) that could predict the risk of DLBCL were systematically identified. In particular, CXCL9 may be the most important potential biomarker for disease progression in DLBCL patients.
ISSN: 1471-2172
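
The summary mentions Kaplan-Meier (KM) survival analysis without saying how the curves were computed. As a rough illustration only, the sketch below implements the standard product-limit estimator, S(t) = product over event times t_i <= t of (1 - d_i / n_i), in plain Python. The function name `kaplan_meier`, the two groups, and the follow-up values are all hypothetical and invented for this example; they are not taken from the article, and the authors may well have used a different tool.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up durations (any consistent unit)
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    # Number of observed events at each time point
    deaths = Counter(t for t, e in zip(times, events) if e)
    # Number of subjects leaving the risk set at each time (event or censored)
    leaving = Counter(times)

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this time
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t are no longer at risk afterwards
        at_risk -= leaving[t]
    return curve


# Toy follow-up data in months, invented for illustration only.
# The split into "high" and "low" expression groups is an assumption.
high_expr = kaplan_meier([5, 8, 12, 12, 20, 30], [1, 1, 1, 0, 1, 0])
low_expr = kaplan_meier([10, 15, 24, 36, 40, 48], [1, 0, 1, 0, 0, 0])

for label, curve in (("high", high_expr), ("low", low_expr)):
    for t, s in curve:
        print(f"{label} expression: S({t}) = {s:.3f}")
```

On this toy data the estimate for the high-expression group falls to 0.25 at month 20, while the low-expression group stays at 0.625 after month 24. A real analysis would also need a significance test between groups (for example a log-rank test), which this sketch omits.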