Recent advances in deep learning and language models for studying the microbiome

Recent advancements in deep learning, particularly large language models (LLMs), have made a significant impact on how researchers study microbiome and metagenomics data. Microbial protein and genomic sequences, like natural languages, form a language of life, enabling the adoption of LLMs to extract use...

Bibliographic Details
Main Authors: Binghao Yan, Yunbi Nam, Lingyao Li, Rebecca A. Deek, Hongzhe Li, Siyuan Ma
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Genetics
Subjects: microbiome, virome, artificial intelligence, large language models, transformer, attention
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2024.1494474/full
author Binghao Yan
Yunbi Nam
Lingyao Li
Rebecca A. Deek
Hongzhe Li
Siyuan Ma
collection DOAJ
description Recent advancements in deep learning, particularly large language models (LLMs), have made a significant impact on how researchers study microbiome and metagenomics data. Microbial protein and genomic sequences, like natural languages, form a language of life, enabling the adoption of LLMs to extract useful insights from complex microbial ecologies. In this paper, we review applications of deep learning and language models in analyzing microbiome and metagenomics data. We focus on problem formulations, necessary datasets, and the integration of language modeling techniques. We provide an extensive overview of protein/genomic language modeling and its contributions to microbiome studies. We also discuss applications such as novel viromics language modeling, biosynthetic gene cluster prediction, and knowledge integration for metagenomics studies.
format Article
id doaj-art-68401cc9cabc4bf19a9aa0b552203616
institution Kabale University
issn 1664-8021
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj-art-68401cc9cabc4bf19a9aa0b552203616
2025-01-07T06:46:03Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2025-01-01
15
10.3389/fgene.2024.1494474
1494474
Recent advances in deep learning and language models for studying the microbiome
Binghao Yan (0)
Yunbi Nam (1)
Lingyao Li (2)
Rebecca A. Deek (3)
Hongzhe Li (4)
Siyuan Ma (5)
Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States
School of Information, University of South Florida, Tampa, FL, United States
Department of Biostatistics and Health Data Science, University of Pittsburgh, Pittsburgh, PA, United States
Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States
https://www.frontiersin.org/articles/10.3389/fgene.2024.1494474/full
microbiome
virome
artificial intelligence
large language models
transformer
attention
title Recent advances in deep learning and language models for studying the microbiome
topic microbiome
virome
artificial intelligence
large language models
transformer
attention
url https://www.frontiersin.org/articles/10.3389/fgene.2024.1494474/full