DPA-2: a large atomic model as a multi-task learner

Main Authors: Duo Zhang, Xinzijian Liu, Xiangyu Zhang, Chengqian Zhang, Chun Cai, Hangrui Bi, Yiming Du, Xuejian Qin, Anyang Peng, Jiameng Huang, Bowen Li, Yifan Shan, Jinzhe Zeng, Yuzhi Zhang, Siyuan Liu, Yifan Li, Junhan Chang, Xinyan Wang, Shuo Zhou, Jianchuan Liu, Xiaoshan Luo, Zhenyu Wang, Wanrun Jiang, Jing Wu, Yudi Yang, Jiyuan Yang, Manyi Yang, Fu-Qiang Gong, Linshuang Zhang, Mengchao Shi, Fu-Zhi Dai, Darrin M. York, Shi Liu, Tong Zhu, Zhicheng Zhong, Jian Lv, Jun Cheng, Weile Jia, Mohan Chen, Guolin Ke, Weinan E, Linfeng Zhang, Han Wang
Format: Article
Language: English
Published: Nature Portfolio 2024-12-01
Series: npj Computational Materials
Online Access: https://doi.org/10.1038/s41524-024-01493-2
collection DOAJ
description The rapid advancements in artificial intelligence (AI) are catalyzing transformative changes in atomic modeling, simulation, and design. AI-driven potential energy models have demonstrated the capability to conduct large-scale, long-duration simulations with the accuracy of ab initio electronic structure methods. However, the model generation process remains a bottleneck for large-scale applications. We propose a shift towards a model-centric ecosystem, wherein a large atomic model (LAM), pre-trained across multiple disciplines, can be efficiently fine-tuned and distilled for various downstream tasks, thereby establishing a new framework for molecular modeling. In this study, we introduce the DPA-2 architecture as a prototype for LAMs. Pre-trained on a diverse array of chemical and materials systems using a multi-task approach, DPA-2 demonstrates superior generalization capabilities across multiple downstream tasks compared to the traditional single-task pre-training and fine-tuning methodologies. Our approach sets the stage for the development and broad application of LAMs in molecular and materials simulation research.
id doaj-art-d3eb156d11c54ea2b5e9a0ffab86cfb3
institution Kabale University
issn 2057-3960
timestamp 2024-12-22T12:36:51Z
citation npj Computational Materials, volume 10, issue 1, pages 1-15, 2024-12-01, https://doi.org/10.1038/s41524-024-01493-2
affiliations Duo Zhang: AI for Science Institute
Xinzijian Liu: AI for Science Institute
Xiangyu Zhang: State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences
Chengqian Zhang: DP Technology
Chun Cai: AI for Science Institute
Hangrui Bi: AI for Science Institute
Yiming Du: State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences
Xuejian Qin: Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences
Anyang Peng: AI for Science Institute
Jiameng Huang: DP Technology
Bowen Li: Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University
Yifan Shan: Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences
Jinzhe Zeng: Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University
Yuzhi Zhang: DP Technology
Siyuan Liu: DP Technology
Yifan Li: Department of Chemistry, Princeton University
Junhan Chang: DP Technology
Xinyan Wang: DP Technology
Shuo Zhou: DP Technology
Jianchuan Liu: School of Electrical Engineering and Electronic Information, Xihua University
Xiaoshan Luo: State Key Laboratory of Superhard Materials, College of Physics, Jilin University
Zhenyu Wang: Key Laboratory of Material Simulation Methods & Software of Ministry of Education, College of Physics, Jilin University
Wanrun Jiang: AI for Science Institute
Jing Wu: Key Laboratory for Quantum Materials of Zhejiang Province, Department of Physics, School of Science, Westlake University
Yudi Yang: Key Laboratory for Quantum Materials of Zhejiang Province, Department of Physics, School of Science, Westlake University
Jiyuan Yang: Key Laboratory for Quantum Materials of Zhejiang Province, Department of Physics, School of Science, Westlake University
Manyi Yang: Atomistic Simulations, Italian Institute of Technology
Fu-Qiang Gong: State Key Laboratory of Physical Chemistry of Solid Surface, iChEM, College of Chemistry and Chemical Engineering, Xiamen University
Linshuang Zhang: DP Technology
Mengchao Shi: DP Technology
Fu-Zhi Dai: AI for Science Institute
Darrin M. York: Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University
Shi Liu: Key Laboratory for Quantum Materials of Zhejiang Province, Department of Physics, School of Science, Westlake University
Tong Zhu: Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University
Zhicheng Zhong: Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences
Jian Lv: Key Laboratory of Material Simulation Methods & Software of Ministry of Education, College of Physics, Jilin University
Jun Cheng: State Key Laboratory of Physical Chemistry of Solid Surface, iChEM, College of Chemistry and Chemical Engineering, Xiamen University
Weile Jia: State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences
Mohan Chen: AI for Science Institute
Guolin Ke: DP Technology
Weinan E: AI for Science Institute
Linfeng Zhang: AI for Science Institute
Han Wang: HEDPS, CAPT, College of Engineering, Peking University