Cohesive data analysis for the identification of prognostic hub genes and significant pathways associated with HER2 + and TN breast cancer types
Abstract Breast cancer is the most prevalent and lethal form of cancer and the most common medical concern among women. Breast cancer etiology implicates numerous cellular protein receptors such as estrogen receptors (ER), progesterone receptors (PR), and human epidermal growth factor receptor 2 (H...
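| Main Authors: | Mahrukh Zakir, Alishbah Saddiqa, Mawara Sheikh, Lalarukh Zakir, Fatima Sami, Faisal Sardar Ahmad, Sadaf Abdul Rauf, Iqra Ali, Zahid Muneer, Wadi B. Alonazi, Abdul Rauf Siddiqi |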
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
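| Subjects: | Breast cancer (BC); Triple negative breast cancer (TNBC); Human epidermal growth factor 2 (HER2 +); Differentially expressed genes (DEGs); Pathway analysis; Network analysis |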
| Online Access: | https://doi.org/10.1038/s41598-025-94084-0 |
| _version_ | 1849238589733339136 |
|---|---|
| author | Mahrukh Zakir; Alishbah Saddiqa; Mawara Sheikh; Lalarukh Zakir; Fatima Sami; Faisal Sardar Ahmad; Sadaf Abdul Rauf; Iqra Ali; Zahid Muneer; Wadi B. Alonazi; Abdul Rauf Siddiqi |
| author_facet | Mahrukh Zakir; Alishbah Saddiqa; Mawara Sheikh; Lalarukh Zakir; Fatima Sami; Faisal Sardar Ahmad; Sadaf Abdul Rauf; Iqra Ali; Zahid Muneer; Wadi B. Alonazi; Abdul Rauf Siddiqi |
| author_sort | Mahrukh Zakir |
| collection | DOAJ |
| description | Abstract Breast cancer (BC) is the most prevalent and lethal form of cancer and the most common medical concern among women. Its etiology implicates numerous cellular protein receptors, such as estrogen receptors (ER), progesterone receptors (PR), and human epidermal growth factor receptor 2 (HER2), which switch on oncogenic cascades often attributed to certain genetic variations. Breast cancer is thus classified into ER+/-, PR+/-, HER2+/- and triple-negative (TNBC) types. This study seeks to build upon current knowledge of the HER2+ and TNBC types to discover novel patterns for diagnosis and prognosis. It exploits the wealth of HER2+ and TNBC transcriptome (RNA-Seq) data to elucidate the key hub genes, their associated networks, pathways, stage-wise expression profiles, role in prognosis and survival expectancy, and regulatory transcription factors. The study also employs machine learning (ML) models, including support vector machine (SVM), XGBoost, Random Forest, k-nearest neighbor (kNN), Naïve Bayes and a Voting Classifier, to distinguish between HER2+ and TNBC transcriptomes, a distinction that is key for early detection and the choice of therapeutic alternatives. RNA-Seq datasets consisting of 49 HER2+ and 44 TNBC breast tumor samples were retrieved and pre-processed. Differentially expressed genes (DEGs) were identified along with their logFC and p-values. KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO (Gene Ontology) analyses of the DEGs were conducted with DAVID (the Database for Annotation, Visualization and Integrated Discovery), and an interaction network was constructed in Cytoscape. Ten hub genes were obtained with cytoHubba based on maximum clique centrality (MCC), maximum neighborhood component (MNC), degree, closeness and betweenness: ACTB, ATM, ESR1, GAPDH, HNRNPK, KRAS, MDM2, SIRT1, TP53, and H3F3C (H3-5). These hub genes were found to be associated with cell proliferation, invasion and migration. The transcription factors regulating these hub genes and the association of their expression profiles with survival expectancy were also determined. Among the ML models, SVM stood out, distinguishing HER2+ from TNBC transcriptomes with an accuracy of 90%. The findings of this study can therefore aid in tracing the initial prognosis of BC and in identifying biomarkers for the personalized prevention, prediction, diagnosis, and treatment of BC. |
| format | Article |
| id | doaj-art-0620f784ca7d40c4ae48a6c816c5e294 |
| institution | Kabale University |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-0620f784ca7d40c4ae48a6c816c5e294; 2025-08-20T04:01:34Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-07-01; vol. 15, no. 1, pp. 1-25; 10.1038/s41598-025-94084-0; Cohesive data analysis for the identification of prognostic hub genes and significant pathways associated with HER2 + and TN breast cancer types; Authors: Mahrukh Zakir (Department of Biosciences, COMSATS University), Alishbah Saddiqa (Department of Biosciences, COMSATS University), Mawara Sheikh (Pakistan Agriculture Research Council Islamabad), Lalarukh Zakir (Azad Jammu and Kashmir Medical College), Fatima Sami (Department of Biosciences, COMSATS University), Faisal Sardar Ahmad (Department of Biosciences, COMSATS University), Sadaf Abdul Rauf (Fatima Jinnah Women University), Iqra Ali (Department of Biosciences, COMSATS University), Zahid Muneer (Department of Biosciences, COMSATS University), Wadi B. Alonazi (Health Administration Department, College of Business Administration, King Saud University), Abdul Rauf Siddiqi (Department of Biosciences, COMSATS University); https://doi.org/10.1038/s41598-025-94084-0; Keywords: Breast cancer (BC); Triple negative breast cancer (TNBC); Human epidermal growth factor 2 (HER2 +); Differentially expressed genes (DEGs); Pathway analysis; Network analysis |
| spellingShingle | Mahrukh Zakir; Alishbah Saddiqa; Mawara Sheikh; Lalarukh Zakir; Fatima Sami; Faisal Sardar Ahmad; Sadaf Abdul Rauf; Iqra Ali; Zahid Muneer; Wadi B. Alonazi; Abdul Rauf Siddiqi; Cohesive data analysis for the identification of prognostic hub genes and significant pathways associated with HER2 + and TN breast cancer types; Scientific Reports; Breast cancer (BC); Triple negative breast cancer (TNBC); Human epidermal growth factor 2 (HER2 +); Differentially expressed genes (DEGs); Pathway analysis; Network analysis |
| title | Cohesive data analysis for the identification of prognostic hub genes and significant pathways associated with HER2 + and TN breast cancer types |
| title_sort | cohesive data analysis for the identification of prognostic hub genes and significant pathways associated with her2 and tn breast cancer types |
| topic | Breast cancer (BC); Triple negative breast cancer (TNBC); Human epidermal growth factor 2 (HER2 +); Differentially expressed genes (DEGs); Pathway analysis; Network analysis |
| url | https://doi.org/10.1038/s41598-025-94084-0 |
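
The abstract in the description field says hub genes were selected by network centrality measures, including degree and closeness. The following is a minimal sketch of that general idea in plain Python, not the cytoHubba implementation and not the study's pipeline. The edge list is hypothetical and exists only for illustration; it does not represent interactions reported by the authors. The function names (`build_adjacency`, `degree`, `closeness`, `rank_hubs`) are likewise assumptions introduced here.

```python
from collections import deque

# HYPOTHETICAL edges for illustration only. Gene symbols are taken from the
# abstract, but these pairings are NOT interactions reported by the study.
EDGES = [
    ("TP53", "MDM2"), ("TP53", "ATM"), ("TP53", "SIRT1"),
    ("TP53", "ESR1"), ("MDM2", "ATM"), ("ESR1", "KRAS"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree(adj, node):
    """Degree centrality (unnormalised): number of direct neighbours."""
    return len(adj[node])


def closeness(adj, node):
    """Closeness centrality over the nodes reachable from `node`.

    Computed as (reachable nodes - 1) / (sum of shortest-path lengths),
    with shortest paths found by breadth-first search.
    """
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def rank_hubs(edges, top_n=3):
    """Rank nodes by degree, then closeness, then name (all descending)."""
    adj = build_adjacency(edges)
    scores = {n: (degree(adj, n), closeness(adj, n)) for n in adj}
    ranked = sorted(scores, key=lambda n: (scores[n][0], scores[n][1], n),
                    reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    print(rank_hubs(EDGES, top_n=3))
```

Combining several centrality scores, as sketched here with a simple lexicographic sort, is only one possible design choice; the record does not state how the study reconciled the five measures it lists.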