Cohesive data analysis for the identification of prognostic hub genes and significant pathways associated with HER2+ and TN breast cancer types

Bibliographic Details
Main Authors: Mahrukh Zakir, Alishbah Saddiqa, Mawara Sheikh, Lalarukh Zakir, Fatima Sami, Faisal Sardar Ahmad, Sadaf Abdul Rauf, Iqra Ali, Zahid Muneer, Wadi B. Alonazi, Abdul Rauf Siddiqi
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-94084-0
collection DOAJ
description Abstract Breast cancer is the most prevalent and lethal form of cancer in women and one of their most common medical concerns. Its etiology involves several cellular protein receptors, including estrogen receptors (ER), progesterone receptors (PR), and human epidermal growth factor receptor 2 (HER2), which activate oncogenic cascades often attributed to genetic variations. Breast cancer (BC) is accordingly classified as ER+/-, PR+/-, HER2+/-, or triple negative (TNBC). This study builds on current knowledge of the HER2+ and TNBC types to discover novel patterns for diagnosis and prognosis. It uses HER2+ and TNBC transcriptome (RNA-Seq) data to identify the key hub genes and their associated networks, pathways, stage-wise expression profiles, roles in prognosis and survival expectancy, and regulatory transcription factors. It also applies machine learning (ML) models, namely support vector machine (SVM), XGBoost, Random Forest, k-nearest neighbor (kNN), Naïve Bayes, and a Voting Classifier, to distinguish HER2+ from TNBC transcriptomes, a distinction that is key for early detection and the choice of therapy. RNA-Seq datasets comprising 49 HER2+ and 44 TNBC breast tumor samples were retrieved and pre-processed. Differentially expressed genes (DEGs) were obtained together with their logFC and p-values. KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO (Gene Ontology) analyses of the DEGs were conducted with DAVID (the Database for Annotation, Visualization and Integrated Discovery), and the interaction network was constructed in Cytoscape. Ten hub genes were identified with cytoHubba based on maximal clique centrality (MCC), maximum neighborhood component (MNC), degree, closeness, and betweenness: ACTB, ATM, ESR1, GAPDH, HNRNPK, KRAS, MDM2, SIRT1, TP53, and H3F3C (H3-5). These hub genes were found to be associated with cell proliferation, invasion, and migration. The transcription factors of these hub genes and the association of their expression profiles with survival expectancy were also determined. Among the ML models, SVM performed best, distinguishing HER2+ from TNBC transcriptomes with an accuracy of 90%. These findings may therefore aid in establishing the initial prognosis of BC and in identifying biomarkers for personalized prevention, prediction, diagnosis, and treatment.
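The hub-gene step in the abstract ranks network nodes by centrality scores. As a rough illustration only, the sketch below computes two of the named measures (degree and closeness) on a toy undirected network and keeps the top-ranked nodes. It is not cytoHubba's implementation and does not cover MCC, MNC, or betweenness. The node labels G1 to G6, the edge list, and the helper names `build_adjacency` and `top_hubs` are hypothetical and are not data or code from the study.

```python
from collections import deque

def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def degree_centrality(adj):
    """Degree = number of direct neighbours of each node."""
    return {node: len(neigh) for node, neigh in adj.items()}

def closeness_centrality(adj):
    """Closeness = reachable nodes / sum of shortest-path distances (BFS)."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores

def top_hubs(adj, k):
    """Rank by degree, break ties by closeness, then by name; keep the top k."""
    deg = degree_centrality(adj)
    clo = closeness_centrality(adj)
    ranked = sorted(adj, key=lambda n: (-deg[n], -clo[n], n))
    return ranked[:k]

# Hypothetical toy network; the labels are placeholders, not genes from the study.
TOY_EDGES = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G4", "G5"), ("G5", "G6"),
]

adj = build_adjacency(TOY_EDGES)
hubs = top_hubs(adj, 2)
print(hubs)
```

Combining several measures, as the abstract describes, guards against a node that scores highly on one measure only. Here the tie among degree-2 nodes is broken by closeness.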
id doaj-art-0620f784ca7d40c4ae48a6c816c5e294
institution Kabale University
issn 2045-2322
spelling doaj-art-0620f784ca7d40c4ae48a6c816c5e294, 2025-08-20T04:01:34Z, eng
Nature Portfolio, Scientific Reports, ISSN 2045-2322, 2025-07-01, volume 15, issue 1, pages 1-25, https://doi.org/10.1038/s41598-025-94084-0
Author affiliations:
Mahrukh Zakir: Department of Biosciences, COMSATS University
Alishbah Saddiqa: Department of Biosciences, COMSATS University
Mawara Sheikh: Pakistan Agriculture Research Council Islamabad
Lalarukh Zakir: Azad Jammu and Kashmir Medical College
Fatima Sami: Department of Biosciences, COMSATS University
Faisal Sardar Ahmad: Department of Biosciences, COMSATS University
Sadaf Abdul Rauf: Fatima Jinnah Women University
Iqra Ali: Department of Biosciences, COMSATS University
Zahid Muneer: Department of Biosciences, COMSATS University
Wadi B. Alonazi: Health Administration Department, College of Business Administration, King Saud University
Abdul Rauf Siddiqi: Department of Biosciences, COMSATS University
topic Breast cancer (BC)
Triple negative breast cancer (TNBC)
Human epidermal growth factor receptor 2 (HER2+)
Differentially expressed genes (DEGs)
Pathway analysis
Network analysis
url https://doi.org/10.1038/s41598-025-94084-0