Conformal novelty detection for multiple metabolic networks

Bibliographic Details
Main Authors: Ariane Marandon, Tabea Rebafka, Nataliya Sokolovska, Hédi Soula
Format: Article
Language: English
Published: BMC 2024-11-01
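Series: BMC Bioinformatics
Subjects: Novelty detection; Conformal prediction; Wrapper method; Metabolic networks; Graph neural networks
Online Access: https://doi.org/10.1186/s12859-024-05971-8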
author Ariane Marandon
Tabea Rebafka
Nataliya Sokolovska
Hédi Soula
collection DOAJ
description Abstract
Background: Graphical representations are useful for modeling complex data in general and biological interactions in particular. Our main motivation is the comparison of metabolic networks in the wider context of developing accurate, noninvasive diagnostic tools. However, the comparison and classification of graphs remain extremely challenging, although a number of highly efficient methods, such as graph neural networks, have been developed over the past decade. Important aspects are still lacking in graph classification: interpretability and guarantees on classification quality, i.e., control of the risk level or of the false discovery rate.
Results: We introduce a statistically sound approach to control the false discovery rate in a classification task for graphs in a semi-supervised setting. Our procedure identifies novelties in a dataset, where a graph is considered a novelty when its topology differs significantly from those in the reference class. The procedure is a conformal prediction approach: it makes no distributional assumptions on the data and can be seen as a wrapper around traditional machine learning models, so it takes full advantage of existing methods. The performance of the proposed method is assessed on several standard benchmarks. It is also adapted and applied to the difficult task of classifying metabolic networks, where each graph represents all metabolic reactions of a bacterium, and to a real task from a cancer data repository.
Conclusions: Our approach efficiently controls the false discovery rate in highly complex data while maximizing the true discovery rate to obtain the most reasonable predictive performance. This contribution focuses on confident classification of complex data, which can further be used to explore complex human pathologies and their mechanisms.
format Article
id doaj-art-36b5f1d4faee491c9b4c96463c1595b6
institution Kabale University
issn 1471-2105
language English
publishDate 2024-11-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj-art-36b5f1d4faee491c9b4c96463c1595b6
2024-11-17T12:51:18Z
eng
BMC
BMC Bioinformatics
1471-2105
2024-11-01
25 (1): 1-18
10.1186/s12859-024-05971-8
Conformal novelty detection for multiple metabolic networks
Ariane Marandon (LPSM, Sorbonne university)
Tabea Rebafka (LPSM, Sorbonne university)
Nataliya Sokolovska (LCQB, Sorbonne university)
Hédi Soula (NutriOmics, Sorbonne university)
https://doi.org/10.1186/s12859-024-05971-8
Novelty detection
Conformal prediction
Wrapper method
Metabolic networks
Graph neural networks
title Conformal novelty detection for multiple metabolic networks
topic Novelty detection
Conformal prediction
Wrapper method
Metabolic networks
Graph neural networks
url https://doi.org/10.1186/s12859-024-05971-8
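
The abstract describes a conformal procedure that wraps a machine learning model and controls the false discovery rate. The sketch below illustrates that general idea with the standard construction of conformal p-values followed by the Benjamini-Hochberg step-up rule. It is not the authors' implementation. The function names, the distance-to-centroid score, and the synthetic 5-dimensional vectors (standing in for graph embeddings) are all assumptions made for illustration. In practice the score would come from whatever model is being wrapped.

```python
from bisect import bisect_left
import math
import random


def conformal_pvalues(cal_scores, test_scores):
    """Conformal p-value per test score (larger score = more novel).

    p = (1 + #{calibration scores >= test score}) / (n_cal + 1)
    """
    cal = sorted(cal_scores)
    n = len(cal)
    # bisect_left counts calibration scores strictly below s,
    # so n - bisect_left(...) counts those >= s.
    return [(1 + n - bisect_left(cal, s)) / (n + 1) for s in test_scores]


def benjamini_hochberg(pvalues, alpha=0.1):
    """Benjamini-Hochberg step-up rule; returns a list of reject flags."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda j: pvalues[j])
    k = 0  # largest rank whose p-value falls under its threshold
    for rank, j in enumerate(order, start=1):
        if pvalues[j] <= alpha * rank / m:
            k = rank
    reject = [False] * m
    for j in order[:k]:
        reject[j] = True
    return reject


def novelty_score(x, center):
    """Hypothetical score: Euclidean distance to the reference centroid."""
    return math.dist(x, center)


def demo(seed=0, alpha=0.1):
    """Synthetic example: 90 reference-like test points, then 10 shifted ones."""
    rng = random.Random(seed)

    def draw(shift):
        return [rng.gauss(shift, 1.0) for _ in range(5)]

    # Reference data is split: one part fits the score, one part calibrates it.
    train = [draw(0.0) for _ in range(300)]
    cal = [draw(0.0) for _ in range(300)]
    test = [draw(0.0) for _ in range(90)] + [draw(4.0) for _ in range(10)]

    center = [sum(col) / len(train) for col in zip(*train)]
    cal_scores = [novelty_score(x, center) for x in cal]
    test_scores = [novelty_score(x, center) for x in test]

    pvals = conformal_pvalues(cal_scores, test_scores)
    return benjamini_hochberg(pvals, alpha)


if __name__ == "__main__":
    flags = demo()
    print("flagged as novelties:", sum(flags), "of", len(flags))
```

Keeping the calibration set separate from the data used to fit the score is what makes the calibration and reference-like test scores exchangeable, and the p-values rely on that property. The score function can be replaced by any model output without changing the two functions that follow it.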