Predicting phage-host interaction via hyperbolic Poincaré graph embedding and large-scale protein language technique
Summary: Bacteriophages (phages) are increasingly viewed as a promising alternative for the treatment of antibiotic-resistant bacterial infections. However, the diversity of host ranges complicates the identification of target phages. Existing computational tools often fail to accurately identify ph...
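Main Authors: | Jie Pan, Rui Wang, Wenjing Liu, Li Wang, Zhuhong You, Yuechao Li, Zhemeng Duan, Qinghua Huang, Jie Feng, Yanmei Sun, Shiwei Wang |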
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-01-01 |
Series: | iScience |
Subjects: | Virology, Microbiology, Bacteriology, Machine learning |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2589004224028748 |
_version_ | 1841555668169916416 |
---|---|
author | Jie Pan, Rui Wang, Wenjing Liu, Li Wang, Zhuhong You, Yuechao Li, Zhemeng Duan, Qinghua Huang, Jie Feng, Yanmei Sun, Shiwei Wang |
author_facet | Jie Pan, Rui Wang, Wenjing Liu, Li Wang, Zhuhong You, Yuechao Li, Zhemeng Duan, Qinghua Huang, Jie Feng, Yanmei Sun, Shiwei Wang |
author_sort | Jie Pan |
collection | DOAJ |
description | Summary: Bacteriophages (phages) are increasingly viewed as a promising alternative for the treatment of antibiotic-resistant bacterial infections. However, the diversity of host ranges complicates the identification of target phages. Existing computational tools often fail to accurately identify phages across different bacterial species. In this study, we present GE-PHI, a machine-learning-based model that predicts phage-host interactions (PHIs) by integrating a knowledge graph embedding algorithm with a large-scale protein language model. First, a phage-host heterogeneous association network (PHAN) was constructed that incorporated phage-phage and host-host similarity networks. Then, multi-relational Poincaré graph embedding (MuRP) was used to extract topological patterns. Additionally, we employed the ESM-2 protein language model to capture evolutionary information from phage tail proteins and host-receptor-binding proteins. GE-PHI achieved a cross-validation area under the curve (AUC) of up to 0.9453 in silico and maintained this performance in case studies. This study provides insights into machine-learning-guided phage therapeutics and diagnostics in microbial engineering. |
format | Article |
id | doaj-art-445472c60f8344288e49e959da081263 |
institution | Kabale University |
issn | 2589-0042 |
language | English |
publishDate | 2025-01-01 |
publisher | Elsevier |
record_format | Article |
series | iScience |
spelling | doaj-art-445472c60f8344288e49e959da081263; 2025-01-08T04:53:18Z; eng; Elsevier; iScience; 2589-0042; 2025-01-01; 28; 1; 111647; Predicting phage-host interaction via hyperbolic Poincaré graph embedding and large-scale protein language technique; Authors: Jie Pan [0], Rui Wang [1], Wenjing Liu [2], Li Wang [3], Zhuhong You [4], Yuechao Li [5], Zhemeng Duan [6], Qinghua Huang [7], Jie Feng [8], Yanmei Sun [9], Shiwei Wang [10]; Affiliations: [0, 2, 3, 6] Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi’an 710069, China; [1] Department of Ophthalmology, The First Affiliated Hospital of Northwest University, 30 Fenxiang, the South Avenue, Xi’an, Shaanxi 710002, China; [4, 5] School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China; [7] School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an, Shaanxi 710072, China; [8] State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China; [9, 10] Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi’an 710069, China (corresponding authors); http://www.sciencedirect.com/science/article/pii/S2589004224028748; Virology; Microbiology; Bacteriology; Machine learning |
title | Predicting phage-host interaction via hyperbolic Poincaré graph embedding and large-scale protein language technique |
topic | Virology, Microbiology, Bacteriology, Machine learning |
url | http://www.sciencedirect.com/science/article/pii/S2589004224028748 |
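
The summary mentions embedding the phage-host network in hyperbolic (Poincaré) space. As a minimal illustration of that geometry only, the sketch below computes the standard geodesic distance between two points in the unit Poincaré ball. It is not the GE-PHI or MuRP implementation: the function name `poincare_distance` and the 2-D coordinates are hypothetical and chosen purely for demonstration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    Uses d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Hypothetical 2-D embeddings (assumed values, not taken from the paper).
phage = [0.1, 0.2]
host_near = [0.15, 0.25]
host_far = [-0.6, -0.7]

print(poincare_distance(phage, host_near))
print(poincare_distance(phage, host_far))
```

Distances grow rapidly as points approach the boundary of the ball, which is why hyperbolic embeddings are often used for hierarchical or tree-like networks.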