Semi-supervised contrastive learning variational autoencoder Integrating single-cell multimodal mosaic datasets

Abstract As single-cell sequencing technology became widely used, scientists found that single-modality data alone could not fully meet the research needs of complex biological systems. To address this issue, researchers began to collect multi-modal single-cell omics data simultaneously. But different...

Bibliographic Details
Main Authors:	Zihao Wang, Zeyu Wu, Minghua Deng
Format:	Article
Language:	English
Published:	BMC 2025-08-01
Series:	BMC Bioinformatics
Subjects:	Mosaic integration; Single-cell multimodal; Batch effect
Online Access:	https://doi.org/10.1186/s12859-025-06239-5

_version_	1849332259516055552
author	Zihao Wang, Zeyu Wu, Minghua Deng
author_facet	Zihao Wang, Zeyu Wu, Minghua Deng
author_sort	Zihao Wang
collection	DOAJ
description	Abstract As single-cell sequencing technology became widely used, scientists found that single-modality data alone could not fully meet the research needs of complex biological systems. To address this issue, researchers began to collect multi-modal single-cell omics data simultaneously. However, different sequencing technologies often produce datasets in which one or more data modalities are missing, so mosaic datasets are common in practice. The high dimensionality and sparsity of the data make integration more difficult, and batch effects pose an additional challenge. To address these challenges, we propose a flexible integration framework based on a variational autoencoder, called scGCM. The main task of scGCM is to integrate single-cell multimodal mosaic data and eliminate batch effects. The method was evaluated on multiple datasets encompassing different modalities of single-cell data. The results demonstrate that, compared to state-of-the-art multimodal data integration methods, scGCM offers significant advantages in clustering accuracy and data consistency. The source code of scGCM can be accessed at https://github.com/closmouz/scCGM .
format	Article
id	doaj-art-9eed07cdc94c4fa59f1c4d923d16f811
institution	Kabale University
issn	1471-2105
language	English
publishDate	2025-08-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj-art-9eed07cdc94c4fa59f1c4d923d16f811; 2025-08-20T03:46:15Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2025-08-01; 261113; 10.1186/s12859-025-06239-5; Semi-supervised contrastive learning variational autoencoder Integrating single-cell multimodal mosaic datasets; Zihao Wang (Biomedical Interdisciplinary Research Center, Peking University); Zeyu Wu (School of Mathematical Sciences, Peking University); Minghua Deng (Biomedical Interdisciplinary Research Center, Peking University); https://doi.org/10.1186/s12859-025-06239-5; Mosaic integration; Single-cell multimodal; Batch effect
title	Semi-supervised contrastive learning variational autoencoder Integrating single-cell multimodal mosaic datasets
title_sort	semi supervised contrastive learning variational autoencoder integrating single cell multimodal mosaic datasets
topic	Mosaic integration; Single-cell multimodal; Batch effect
url	https://doi.org/10.1186/s12859-025-06239-5

Similar Items