Crystal structure generation with autoregressive large language modeling
Abstract The generation of plausible crystal structures is often the first step in predicting the structure and properties of a material from its chemical composition. However, most current methods for crystal structure prediction are computationally expensive, slowing the pace of innovation. Seedin...
| Main Authors: | Luis M. Antunes, Keith T. Butler, Ricardo Grau-Crespo |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2024-12-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-024-54639-7 |

| Field | Value |
|---|---|
| author | Luis M. Antunes Keith T. Butler Ricardo Grau-Crespo |
| collection | DOAJ |
| description | Abstract The generation of plausible crystal structures is often the first step in predicting the structure and properties of a material from its chemical composition. However, most current methods for crystal structure prediction are computationally expensive, slowing the pace of innovation. Seeding structure prediction algorithms with quality generated candidates can overcome a major bottleneck. Here, we introduce CrystaLLM, a methodology for the versatile generation of crystal structures, based on the autoregressive large language modeling (LLM) of the Crystallographic Information File (CIF) format. Trained on millions of CIF files, CrystaLLM focuses on modeling crystal structures through text. CrystaLLM can produce plausible crystal structures for a wide range of inorganic compounds unseen in training, as demonstrated by ab initio simulations. Our approach challenges conventional representations of crystals, and demonstrates the potential of LLMs for learning effective models of crystal chemistry, which will lead to accelerated discovery and innovation in materials science. |
| id | doaj-art-de7409d068ed45c5acd8350760693f9f |
| institution | Kabale University |
| issn | 2041-1723 |
| spelling | Luis M. Antunes (Department of Chemistry, University of Reading); Keith T. Butler (Department of Chemistry, University College London); Ricardo Grau-Crespo (Department of Chemistry, University of Reading). Crystal structure generation with autoregressive large language modeling. Nature Communications, vol. 15, no. 1, pp. 1-16, 2024-12-01. Nature Portfolio. ISSN 2041-1723. https://doi.org/10.1038/s41467-024-54639-7. Record updated 2024-12-08T12:35:37Z. |
| title | Crystal structure generation with autoregressive large language modeling |