Mining contextually meaningful subgraphs from a vertex-attributed graph

Abstract Networks have emerged as a natural data structure to represent relations among entities. Proteins interact to carry out cellular functions and protein-Protein interaction network analysis has been employed for understanding the cellular machinery. Advances in genomics technologies enabled t...

Full description

Saved in:
Bibliographic Details
Main Authors: Riyad Hakim, Saeed Salem
Format: Article
Language:English
Published: BMC 2024-11-01
Series:BMC Bioinformatics
Subjects:
Online Access:https://doi.org/10.1186/s12859-024-05960-x
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1846164843741052928
author Riyad Hakim
Saeed Salem
author_facet Riyad Hakim
Saeed Salem
author_sort Riyad Hakim
collection DOAJ
description Abstract Networks have emerged as a natural data structure to represent relations among entities. Proteins interact to carry out cellular functions and protein-Protein interaction network analysis has been employed for understanding the cellular machinery. Advances in genomics technologies enabled the collection of large data that annotate proteins in interaction networks. Integrative analysis of interaction networks with gene expression and annotations enables the discovery of context-specific complexes and improves the identification of functional modules and pathways. Extracting subnetworks whose vertices are connected and have high attribute similarity have applications in diverse domains. We present an enumeration approach for mining sets of connected and cohesive subgraphs, where vertices in the subgraphs have similar attribute profile. Due to the large number of cohesive connected subgraphs and to overcome the overlap among these subgraphs, we propose an algorithm for enumerating a set of representative subgraphs, the set of all closed subgraphs. We propose pruning strategies for efficiently enumerating the search tree without missing any pattern or reporting duplicate subgraphs. On a real protein-protein interaction network with attributes representing the dysregulation profile of genes in multiple cancers, we mine closed cohesive connected subnetworks and show their biological significance. Moreover, we conduct a runtime comparison with existing algorithms to show the efficiency of our proposed algorithm.
format Article
id doaj-art-e090ae5c2c644f73a68cf1b6fa35d6a6
institution Kabale University
issn 1471-2105
language English
publishDate 2024-11-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj-art-e090ae5c2c644f73a68cf1b6fa35d6a62024-11-17T12:51:17ZengBMCBMC Bioinformatics1471-21052024-11-0125112510.1186/s12859-024-05960-xMining contextually meaningful subgraphs from a vertex-attributed graphRiyad Hakim0Saeed Salem1Computer Science, North Dakota State UniversityComputer Science and Engineering, Qatar UniversityAbstract Networks have emerged as a natural data structure to represent relations among entities. Proteins interact to carry out cellular functions and protein-Protein interaction network analysis has been employed for understanding the cellular machinery. Advances in genomics technologies enabled the collection of large data that annotate proteins in interaction networks. Integrative analysis of interaction networks with gene expression and annotations enables the discovery of context-specific complexes and improves the identification of functional modules and pathways. Extracting subnetworks whose vertices are connected and have high attribute similarity have applications in diverse domains. We present an enumeration approach for mining sets of connected and cohesive subgraphs, where vertices in the subgraphs have similar attribute profile. Due to the large number of cohesive connected subgraphs and to overcome the overlap among these subgraphs, we propose an algorithm for enumerating a set of representative subgraphs, the set of all closed subgraphs. We propose pruning strategies for efficiently enumerating the search tree without missing any pattern or reporting duplicate subgraphs. On a real protein-protein interaction network with attributes representing the dysregulation profile of genes in multiple cancers, we mine closed cohesive connected subnetworks and show their biological significance. Moreover, we conduct a runtime comparison with existing algorithms to show the efficiency of our proposed algorithm.https://doi.org/10.1186/s12859-024-05960-xAttributed graphSubgraph enumerationCohesive subgraphMinimum supportMaximal subgraphClosed subgraph
spellingShingle Riyad Hakim
Saeed Salem
Mining contextually meaningful subgraphs from a vertex-attributed graph
BMC Bioinformatics
Attributed graph
Subgraph enumeration
Cohesive subgraph
Minimum support
Maximal subgraph
Closed subgraph
title Mining contextually meaningful subgraphs from a vertex-attributed graph
title_full Mining contextually meaningful subgraphs from a vertex-attributed graph
title_fullStr Mining contextually meaningful subgraphs from a vertex-attributed graph
title_full_unstemmed Mining contextually meaningful subgraphs from a vertex-attributed graph
title_short Mining contextually meaningful subgraphs from a vertex-attributed graph
title_sort mining contextually meaningful subgraphs from a vertex attributed graph
topic Attributed graph
Subgraph enumeration
Cohesive subgraph
Minimum support
Maximal subgraph
Closed subgraph
url https://doi.org/10.1186/s12859-024-05960-x
work_keys_str_mv AT riyadhakim miningcontextuallymeaningfulsubgraphsfromavertexattributedgraph
AT saeedsalem miningcontextuallymeaningfulsubgraphsfromavertexattributedgraph