Hessian QM9: A quantum chemistry database of molecular Hessians in implicit solvents
Abstract A significant challenge in computational chemistry is developing approximations that accelerate ab initio methods while preserving accuracy. Machine learning interatomic potentials (MLIPs) have emerged as a promising solution for constructing atomistic potentials that can be transferred acr...
Main Authors: | Nicholas J. Williams, Lara Kabalan, Ljiljana Stojanovic, Viktor Zólyomi, Edward O. Pyzer-Knapp |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Data |
Online Access: | https://doi.org/10.1038/s41597-024-04361-2 |
_version_ | 1841559866654588928 |
---|---|
author | Nicholas J. Williams; Lara Kabalan; Ljiljana Stojanovic; Viktor Zólyomi; Edward O. Pyzer-Knapp |
author_facet | Nicholas J. Williams; Lara Kabalan; Ljiljana Stojanovic; Viktor Zólyomi; Edward O. Pyzer-Knapp |
author_sort | Nicholas J. Williams |
collection | DOAJ |
description | Abstract: A significant challenge in computational chemistry is developing approximations that accelerate ab initio methods while preserving accuracy. Machine learning interatomic potentials (MLIPs) have emerged as a promising solution for constructing atomistic potentials that can be transferred across different molecular and crystalline systems. Most MLIPs are trained only on energies and forces in vacuum, while an improved description of the potential energy surface could be achieved by including the curvature of the potential energy surface. We present Hessian QM9, the first database of equilibrium configurations and numerical Hessian matrices, consisting of 41,645 molecules from the QM9 dataset at the ωB97x/6-31G* level. Molecular Hessians were calculated in vacuum, as well as water, tetrahydrofuran, and toluene using an implicit solvation model. To demonstrate the utility of this dataset, we show that incorporating second derivatives of the potential energy surface into the loss function of an MLIP significantly improves the prediction of vibrational frequencies in all solvent environments, thus making this dataset extremely useful for studying organic molecules in realistic solvent environments for experimental characterization. |
format | Article |
id | doaj-art-1fc0b00376e848d99915d9067b254ff7 |
institution | Kabale University |
issn | 2052-4463 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Data |
spelling | doaj-art-1fc0b00376e848d99915d9067b254ff7; 2025-01-05T12:08:20Z; eng; Nature Portfolio; Scientific Data; 2052-4463; 2025-01-01; 12116; 10.1038/s41597-024-04361-2; Hessian QM9: A quantum chemistry database of molecular Hessians in implicit solvents; Nicholas J. Williams [0]; Lara Kabalan [1]; Ljiljana Stojanovic [2]; Viktor Zólyomi [3]; Edward O. Pyzer-Knapp [4]; [0] IBM Research; [1] Hartree Centre, Science and Technology Facilities Council, Daresbury Laboratory; [2] Hartree Centre, Science and Technology Facilities Council, Daresbury Laboratory; [3] Hartree Centre, Science and Technology Facilities Council, Daresbury Laboratory; [4] IBM Research; https://doi.org/10.1038/s41597-024-04361-2 |
spellingShingle | Nicholas J. Williams; Lara Kabalan; Ljiljana Stojanovic; Viktor Zólyomi; Edward O. Pyzer-Knapp; Hessian QM9: A quantum chemistry database of molecular Hessians in implicit solvents; Scientific Data |
title | Hessian QM9: A quantum chemistry database of molecular Hessians in implicit solvents |
title_full | Hessian QM9: A quantum chemistry database of molecular Hessians in implicit solvents |
title_fullStr | Hessian QM9: A quantum chemistry database of molecular Hessians in implicit solvents |
title_full_unstemmed | Hessian QM9: A quantum chemistry database of molecular Hessians in implicit solvents |
title_short | Hessian QM9: A quantum chemistry database of molecular Hessians in implicit solvents |
title_sort | hessian qm9 a quantum chemistry database of molecular hessians in implicit solvents |
url | https://doi.org/10.1038/s41597-024-04361-2 |
work_keys_str_mv | AT nicholasjwilliams hessianqm9aquantumchemistrydatabaseofmolecularhessiansinimplicitsolvents AT larakabalan hessianqm9aquantumchemistrydatabaseofmolecularhessiansinimplicitsolvents AT ljiljanastojanovic hessianqm9aquantumchemistrydatabaseofmolecularhessiansinimplicitsolvents AT viktorzolyomi hessianqm9aquantumchemistrydatabaseofmolecularhessiansinimplicitsolvents AT edwardopyzerknapp hessianqm9aquantumchemistrydatabaseofmolecularhessiansinimplicitsolvents |