Cancer molecular subtyping using limited multi-omics data with missingness.

Diagnosing cancer subtypes is a prerequisite for precise treatment. Existing diagnostic solutions based on multi-omics data fusion require sufficient samples with complete multi-omics data, which are challenging to obtain in clinical applications. To address the bottleneck of collectin...

Bibliographic Details
Main Authors: Yongqi Bu, Jiaxuan Liang, Zhen Li, Jianbo Wang, Jun Wang, Guoxian Yu
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-12-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1012710
author Yongqi Bu
Jiaxuan Liang
Zhen Li
Jianbo Wang
Jun Wang
Guoxian Yu
collection DOAJ
description Diagnosing cancer subtypes is a prerequisite for precise treatment. Existing diagnostic solutions based on multi-omics data fusion require sufficient samples with complete multi-omics data, which are challenging to obtain in clinical applications. To address this bottleneck, we propose a flexible integrative model (CancerSD) that diagnoses cancer subtypes using limited samples with incomplete multi-omics data. CancerSD designs contrastive learning tasks and masking-and-reconstruction tasks to reliably impute missing omics, and fuses the available omics data with the imputed ones to accurately diagnose cancer subtypes. To address the issue of limited clinical samples, it introduces a category-level contrastive loss to extend the meta-learning framework, effectively transferring knowledge from external datasets to pretrain the diagnostic model. Experiments on benchmark datasets show that CancerSD not only gives accurate diagnoses, but also maintains high authenticity and good interpretability. In addition, CancerSD identifies important molecular characteristics associated with cancer subtypes, and it defines the Integrated CancerSD Score, which can serve as an independent predictive factor for patient prognosis.
format Article
id doaj-art-bbd786cca9484892ae2db8d7603b5bf2
institution Kabale University
issn 1553-734X
1553-7358
language English
publishDate 2024-12-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Computational Biology
spelling doaj-art-bbd786cca9484892ae2db8d7603b5bf2; 2025-01-17T05:30:55Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2024-12-01; 20; 12; e1012710; 10.1371/journal.pcbi.1012710; Cancer molecular subtyping using limited multi-omics data with missingness.; Yongqi Bu; Jiaxuan Liang; Zhen Li; Jianbo Wang; Jun Wang; Guoxian Yu; https://doi.org/10.1371/journal.pcbi.1012710
title Cancer molecular subtyping using limited multi-omics data with missingness.
url https://doi.org/10.1371/journal.pcbi.1012710