An automated privacy-preserving self-supervised classification of COVID-19 from lung CT scan images minimizing the requirements of large data annotation
Abstract This study presents a novel privacy-preserving self-supervised (SSL) framework for COVID-19 classification from lung CT scans, utilizing federated learning (FL) enhanced with Paillier homomorphic encryption (PHE) to prevent third-party attacks during training. The FL-SSL based framework emp...
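Main Authors: | Sadia Sultana Chowa, Md Rahad Islam Bhuiyan, Mst. Sazia Tahosin, Asif Karim, Sidratul Montaha, Md. Mehedi Hassan, Mohd Asif Shah, Sami Azam |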
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Reports |
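Subjects: | Self-supervised learning; Contrastive learning; VGG-19; Attention-CNN; Federated learning; Privacy-preserving |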
Online Access: | https://doi.org/10.1038/s41598-024-83972-6 |
_version_ | 1841559653383667712 |
---|---|
author | Sadia Sultana Chowa; Md Rahad Islam Bhuiyan; Mst. Sazia Tahosin; Asif Karim; Sidratul Montaha; Md. Mehedi Hassan; Mohd Asif Shah; Sami Azam |
author_sort | Sadia Sultana Chowa |
collection | DOAJ |
description | Abstract This study presents a novel privacy-preserving self-supervised learning (SSL) framework for COVID-19 classification from lung CT scans, utilizing federated learning (FL) enhanced with Paillier homomorphic encryption (PHE) to prevent third-party attacks during training. The FL-SSL framework employs two publicly available lung CT scan datasets, one treated as labeled and the other as unlabeled. The unlabeled dataset is split into three subsets, which are assumed to be collected from three hospitals. Training is done using the Bootstrap Your Own Latent (BYOL) contrastive learning SSL framework with a VGG19 encoder followed by attention CNN blocks (VGG19 + attention CNN). The input datasets are processed by selecting the largest lung portion of each lung CT scan using an automated selection approach, and a 64 × 64 input size is used to reduce computational complexity. Healthcare privacy issues are addressed by collaborative training across decentralized datasets and secure aggregation with PHE. The three subsets are used to train local BYOL models, which together optimize the central encoder. The labeled dataset is then used to train the central encoder (updated VGG19 + attention CNN), resulting in an accuracy of 97.19%, a precision of 97.43%, and a recall of 98.18%. The reliability of the framework’s performance is demonstrated through statistical analysis and five-fold cross-validation. The efficacy of the proposed framework is further demonstrated on datasets from three distinct modalities: skin cancer, breast cancer, and chest X-rays. In conclusion, this study offers a promising solution for accurate diagnosis of chest X-rays, preserving privacy and overcoming the challenges of dataset scarcity and computational complexity. |
format | Article |
id | doaj-art-406c4126128f4f169149a25e4134699f |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-406c4126128f4f169149a25e4134699f; 2025-01-05T12:18:21Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; 15(1), 1-20; 10.1038/s41598-024-83972-6; Author affiliations: Sadia Sultana Chowa (Faculty of Science and Technology, Charles Darwin University); Md Rahad Islam Bhuiyan (Faculty of Science and Technology, Charles Darwin University); Mst. Sazia Tahosin (Health Informatics Research Laboratory (HIRL), Department of Computer Science and Engineering, Daffodil International University); Asif Karim (Faculty of Science and Technology, Charles Darwin University); Sidratul Montaha (Department of Computer Science, University of Calgary); Md. Mehedi Hassan (Computer Science and Engineering Discipline, Khulna University); Mohd Asif Shah (Department of Economics, Bakhtar University); Sami Azam (Faculty of Science and Technology, Charles Darwin University) |
title | An automated privacy-preserving self-supervised classification of COVID-19 from lung CT scan images minimizing the requirements of large data annotation |
topic | Self-supervised learning; Contrastive learning; VGG-19; Attention-CNN; Federated learning; Privacy-preserving |
url | https://doi.org/10.1038/s41598-024-83972-6 |
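The abstract describes secure aggregation of local model updates with Paillier homomorphic encryption across three sites. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: the toy key size, the fixed-point `SCALE`, the function names, and the assumption that a single trusted party holds the private key are all hypothetical choices made for this example. It relies on the additive property of Paillier: multiplying two ciphertexts yields an encryption of the sum of their plaintexts, so an aggregator can sum updates without seeing them.

```python
import math
import random

SCALE = 10**6  # hypothetical fixed-point scale for encoding float weights


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two Mersenne primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, value):
    """Encrypt one float as c = g^m * r^n mod n^2 with g = n + 1."""
    n2 = n * n
    m = round(value * SCALE) % n  # negative values wrap around modulo n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)


def decrypt(n, priv, c):
    """Recover the float using L(x) = (x - 1) // n and the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    m = ((x - 1) // n * mu) % n
    if m > n // 2:  # map the upper half of the range back to negatives
        m -= n
    return m / SCALE


def secure_average(n, priv, client_updates):
    """Each client encrypts its update; the server only multiplies ciphertexts."""
    encrypted = [[encrypt(n, w) for w in update] for update in client_updates]
    total = encrypted[0]
    for enc in encrypted[1:]:
        total = [add_encrypted(n, a, b) for a, b in zip(total, enc)]
    # Only the key holder can decrypt the summed update and average it.
    return [decrypt(n, priv, c) / len(client_updates) for c in total]
```

For example, calling `secure_average(n, priv, updates)` with three short weight lists (standing in for the three hospital subsets) returns their element-wise mean, while the aggregation step itself touches only ciphertexts. A real deployment would use a vetted library and keys of at least 2048 bits; the record does not state the paper's key size or key distribution scheme.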