An automated privacy-preserving self-supervised classification of COVID-19 from lung CT scan images minimizing the requirements of large data annotation

Abstract This study presents a novel privacy-preserving self-supervised (SSL) framework for COVID-19 classification from lung CT scans, utilizing federated learning (FL) enhanced with Paillier homomorphic encryption (PHE) to prevent third-party attacks during training. The FL-SSL based framework emp...

Bibliographic Details
Main Authors:	Sadia Sultana Chowa, Md Rahad Islam Bhuiyan, Mst. Sazia Tahosin, Asif Karim, Sidratul Montaha, Md. Mehedi Hassan, Mohd Asif Shah, Sami Azam
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-01-01
Series:	Scientific Reports
Subjects:	Self-supervised learning; Contrastive learning; VGG-19; Attention-CNN; Federated learning; Privacy-preserving
Online Access:	https://doi.org/10.1038/s41598-024-83972-6

author	Sadia Sultana Chowa, Md Rahad Islam Bhuiyan, Mst. Sazia Tahosin, Asif Karim, Sidratul Montaha, Md. Mehedi Hassan, Mohd Asif Shah, Sami Azam
author_facet	Sadia Sultana Chowa, Md Rahad Islam Bhuiyan, Mst. Sazia Tahosin, Asif Karim, Sidratul Montaha, Md. Mehedi Hassan, Mohd Asif Shah, Sami Azam
author_sort	Sadia Sultana Chowa
collection	DOAJ
description	Abstract: This study presents a novel privacy-preserving self-supervised learning (SSL) framework for COVID-19 classification from lung CT scans. It uses federated learning (FL) with Paillier homomorphic encryption (PHE) to protect against third-party attacks during training. The FL-SSL framework uses two publicly available lung CT scan datasets, one treated as labeled and the other as unlabeled. The unlabeled dataset is split into three subsets that are assumed to come from three hospitals. Training uses the Bootstrap Your Own Latent (BYOL) contrastive learning SSL framework with a VGG19 encoder followed by attention CNN blocks (VGG19 + attention CNN). The input datasets are preprocessed by selecting the largest lung portion of each lung CT scan with an automated selection approach, and a 64 × 64 input size is used to reduce computational complexity. Healthcare privacy concerns are addressed through collaborative training across decentralized datasets and secure aggregation with PHE. The three subsets are used to train local BYOL models, which together optimize the central encoder. The labeled dataset is then used to train the central encoder (updated VGG19 + attention CNN), yielding an accuracy of 97.19%, a precision of 97.43%, and a recall of 98.18%. The reliability of the framework's performance is demonstrated through statistical analysis and five-fold cross-validation. Its efficacy is further shown on three datasets of distinct modalities: skin cancer, breast cancer, and chest X-rays. In conclusion, this study offers a promising solution for accurate COVID-19 diagnosis from lung CT scans that preserves privacy and addresses the challenges of dataset scarcity and computational complexity.
format	Article
id	doaj-art-406c4126128f4f169149a25e4134699f
institution	Kabale University
issn	2045-2322
language	English
publishDate	2025-01-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-406c4126128f4f169149a25e4134699f | 2025-01-05T12:18:21Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2025-01-01 | vol. 15, no. 1, pp. 1-20 | 10.1038/s41598-024-83972-6 | An automated privacy-preserving self-supervised classification of COVID-19 from lung CT scan images minimizing the requirements of large data annotation | Sadia Sultana Chowa (Faculty of Science and Technology, Charles Darwin University); Md Rahad Islam Bhuiyan (Faculty of Science and Technology, Charles Darwin University); Mst. Sazia Tahosin (Health Informatics Research Laboratory (HIRL), Department of Computer Science and Engineering, Daffodil International University); Asif Karim (Faculty of Science and Technology, Charles Darwin University); Sidratul Montaha (Department of Computer Science, University of Calgary); Md. Mehedi Hassan (Computer Science and Engineering Discipline, Khulna University); Mohd Asif Shah (Department of Economics, Bakhtar University); Sami Azam (Faculty of Science and Technology, Charles Darwin University) | https://doi.org/10.1038/s41598-024-83972-6 | Self-supervised learning; Contrastive learning; VGG-19; Attention-CNN; Federated learning; Privacy-preserving
title	An automated privacy-preserving self-supervised classification of COVID-19 from lung CT scan images minimizing the requirements of large data annotation
topic	Self-supervised learning; Contrastive learning; VGG-19; Attention-CNN; Federated learning; Privacy-preserving
url	https://doi.org/10.1038/s41598-024-83972-6

Similar Items