An automatic end-to-end chemical synthesis development platform powered by large language models

Bibliographic Details
Main Authors: Yixiang Ruan, Chenyin Lu, Ning Xu, Yuchen He, Yixin Chen, Jian Zhang, Jun Xuan, Jianzhang Pan, Qun Fang, Hanyu Gao, Xiaodong Shen, Ning Ye, Qiang Zhang, Yiming Mo
Format: Article
Language: English
Published: Nature Portfolio 2024-11-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-024-54457-x
author Yixiang Ruan
Chenyin Lu
Ning Xu
Yuchen He
Yixin Chen
Jian Zhang
Jun Xuan
Jianzhang Pan
Qun Fang
Hanyu Gao
Xiaodong Shen
Ning Ye
Qiang Zhang
Yiming Mo
collection DOAJ
description Abstract The rapid emergence of large language model (LLM) technology presents promising opportunities to facilitate the development of synthetic reactions. In this work, we leveraged the power of GPT-4 to build an LLM-based reaction development framework (LLM-RDF) to handle fundamental tasks involved throughout chemical synthesis development. LLM-RDF comprises six specialized LLM-based agents (Literature Scouter, Experiment Designer, Hardware Executor, Spectrum Analyzer, Separation Instructor, and Result Interpreter), each pre-prompted to accomplish its designated tasks. A web application with LLM-RDF as the backend was built to allow chemist users to interact with automated experimental platforms and analyze results via natural language, thereby eliminating the need for coding skills and ensuring accessibility for all chemists. We demonstrated the capabilities of LLM-RDF in guiding the end-to-end synthesis development process for the copper/TEMPO-catalyzed aerobic oxidation of alcohols to aldehydes, including literature search and information extraction, substrate scope and condition screening, reaction kinetics study, reaction condition optimization, reaction scale-up, and product purification. Furthermore, LLM-RDF’s broader applicability and versatility were validated on various synthesis tasks for three distinct reactions (SNAr reaction, photoredox C-C cross-coupling reaction, and heterogeneous photoelectrochemical reaction).
format Article
id doaj-art-79f16f538fb04daeae430f83d0064274
institution Kabale University
issn 2041-1723
language English
publishDate 2024-11-01
publisher Nature Portfolio
record_format Article
series Nature Communications
spelling doaj-art-79f16f538fb04daeae430f83d0064274
Record updated: 2024-11-24T12:32:23Z
Language: eng
Publisher: Nature Portfolio
Journal: Nature Communications
ISSN: 2041-1723
Published: 2024-11-01
Volume 15, issue 1, pages 1-16
DOI: 10.1038/s41467-024-54457-x
Title: An automatic end-to-end chemical synthesis development platform powered by large language models
Authors and affiliations:
Yixiang Ruan (0): College of Chemical and Biological Engineering, Zhejiang University
Chenyin Lu (1): Zhejiang-Hong Kong Joint Laboratory for Intelligent Molecule and Material Design and Synthesis, ZJU-Hangzhou Global Scientific and Technological Innovation Center
Ning Xu (2): College of Chemical and Biological Engineering, Zhejiang University
Yuchen He (3): College of Chemical and Biological Engineering, Zhejiang University
Yixin Chen (4): College of Chemical and Biological Engineering, Zhejiang University
Jian Zhang (5): Zhejiang-Hong Kong Joint Laboratory for Intelligent Molecule and Material Design and Synthesis, ZJU-Hangzhou Global Scientific and Technological Innovation Center
Jun Xuan (6): Zhejiang-Hong Kong Joint Laboratory for Intelligent Molecule and Material Design and Synthesis, ZJU-Hangzhou Global Scientific and Technological Innovation Center
Jianzhang Pan (7): Zhejiang-Hong Kong Joint Laboratory for Intelligent Molecule and Material Design and Synthesis, ZJU-Hangzhou Global Scientific and Technological Innovation Center
Qun Fang (8): Zhejiang-Hong Kong Joint Laboratory for Intelligent Molecule and Material Design and Synthesis, ZJU-Hangzhou Global Scientific and Technological Innovation Center
Hanyu Gao (9): Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology
Xiaodong Shen (10): Chemical & Analytical Development, Suzhou Novartis Technical Development Co. Ltd.
Ning Ye (11): Rezubio Pharmaceuticals Co. Ltd.
Qiang Zhang (12): Zhejiang-Hong Kong Joint Laboratory for Intelligent Molecule and Material Design and Synthesis, ZJU-Hangzhou Global Scientific and Technological Innovation Center
Yiming Mo (13): College of Chemical and Biological Engineering, Zhejiang University
Online access: https://doi.org/10.1038/s41467-024-54457-x
title An automatic end-to-end chemical synthesis development platform powered by large language models
url https://doi.org/10.1038/s41467-024-54457-x