Transition1x -- a Dataset for Building Generalizable Reactive Machine Learning Potentials

Schreiner, Mathias; Bhowmik, Arghya; Vegge, Tejs; Busk, Jonas; Winther, Ole

arXiv:2207.12858 (physics)

[Submitted on 25 Jul 2022 (v1), last revised 1 Sep 2022 (this version, v2)]

Abstract: Machine Learning (ML) models have, in contrast to their usefulness in molecular dynamics studies, had limited success as surrogate potentials for reaction barrier search. This is due to the scarcity of training data in the relevant transition-state regions of chemical space. Currently available datasets for training ML models on small molecular systems almost exclusively contain configurations at or near equilibrium. In this work, we present the dataset Transition1x, containing 9.6 million Density Functional Theory (DFT) calculations of forces and energies of molecular configurations on and around reaction pathways at the ωB97x/6-31G(d) level of theory. The data was generated by running Nudged Elastic Band (NEB) calculations with DFT on 10k reactions while saving intermediate calculations. We train state-of-the-art equivariant graph message-passing neural network models on Transition1x and cross-validate on the popular ANI1x and QM9 datasets. We show that ML models cannot learn features in transition-state regions solely by training on hitherto popular benchmark datasets. Transition1x is a new, challenging benchmark that will provide an important step towards developing next-generation ML force fields that also work far from equilibrium configurations and in reactive systems.
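
The data-generation idea in the abstract (run NEB and keep every intermediate energy and force evaluation) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a hypothetical two-dimensional toy potential in place of DFT, a plain steepest-descent optimizer, and a simple central-difference tangent. The names `energy_and_gradient` and `run_neb` and all parameter values are illustrative only.

```python
import math


def energy_and_gradient(x, y):
    """Toy 2D potential (an assumption, standing in for a DFT call).

    Minima at (-1, 0) and (1, 0); saddle point at (0, 0.5) with energy 1.
    """
    u = y - 0.5 * (1.0 - x * x)
    energy = (x * x - 1.0) ** 2 + 2.0 * u * u
    grad_x = 4.0 * x * (x * x - 1.0) + 4.0 * u * x
    grad_y = 4.0 * u
    return energy, (grad_x, grad_y)


def run_neb(n_images=9, n_steps=2000, step=0.01, k_spring=1.0):
    """Relax a chain of images between the two minima with plain NEB forces.

    Every energy/force evaluation is appended to `records`, mirroring the
    idea of saving intermediate calculations as training data.
    """
    start, end = (-1.0, 0.0), (1.0, 0.0)
    # Initial guess: linear interpolation between the fixed endpoints.
    path = [
        (start[0] + (end[0] - start[0]) * i / (n_images - 1),
         start[1] + (end[1] - start[1]) * i / (n_images - 1))
        for i in range(n_images)
    ]
    records = []
    for it in range(n_steps):
        new_path = list(path)
        for i in range(1, n_images - 1):  # endpoints stay fixed
            x, y = path[i]
            energy, (gx, gy) = energy_and_gradient(x, y)
            records.append({"step": it, "image": i, "position": (x, y),
                            "energy": energy, "force": (-gx, -gy)})
            # Unit tangent estimated from the two neighbouring images.
            tx = path[i + 1][0] - path[i - 1][0]
            ty = path[i + 1][1] - path[i - 1][1]
            norm = math.hypot(tx, ty)
            tx, ty = tx / norm, ty / norm
            # True force: keep only the component perpendicular to the path.
            g_par = gx * tx + gy * ty
            fx = -(gx - g_par * tx)
            fy = -(gy - g_par * ty)
            # Spring force: acts along the tangent to keep images evenly spaced.
            d_next = math.hypot(path[i + 1][0] - x, path[i + 1][1] - y)
            d_prev = math.hypot(x - path[i - 1][0], y - path[i - 1][1])
            f_spring = k_spring * (d_next - d_prev)
            fx += f_spring * tx
            fy += f_spring * ty
            new_path[i] = (x + step * fx, y + step * fy)
        path = new_path
    return path, records


if __name__ == "__main__":
    final_path, records = run_neb()
    barrier = max(energy_and_gradient(x, y)[0] for x, y in final_path)
    print(f"{len(records)} saved evaluations, highest image energy {barrier:.3f}")
```

In this sketch the highest-energy image of the relaxed chain should approach the saddle of the toy surface, while `records` also holds the off-path configurations visited during relaxation, which is the kind of near-pathway data the abstract describes.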

Subjects:	Chemical Physics (physics.chem-ph); Artificial Intelligence (cs.AI); Computational Physics (physics.comp-ph)
Cite as:	arXiv:2207.12858 [physics.chem-ph]
	(or arXiv:2207.12858v2 [physics.chem-ph] for this version)
	https://doi.org/10.48550/arXiv.2207.12858

Submission history

From: Mathias Schreiner
[v1] Mon, 25 Jul 2022 07:30:14 UTC (422 KB)
[v2] Thu, 1 Sep 2022 10:09:43 UTC (736 KB)
