A machine learning potential construction based on radial distribution function sampling

© 2024 The Author(s). Journal of Computational Chemistry published by Wiley Periodicals LLC.

Bibliographic details
Published in: Journal of computational chemistry. - 1984. - 45(2024), 32 (15 Nov.), pages 2949-2958
Main author: Watanabe, Natsuki (Author)
Other authors: Hori, Yuta; Sugisawa, Hiroki; Ida, Tomonori; Shoji, Mitsuo; Shigeta, Yasuteru
Format: Online article
Language: English
Published: 2024
Access to the parent work: Journal of computational chemistry
Subjects: Journal Article; machine learning potential; molecular cluster; quantum chemical calculations; radial distribution function; training data sampling
Description
Summary:
Sampling reference data is crucial in machine learning potential (MLP) construction. Inadequate coverage of local configurations in reference data may lead to unphysical behaviors in MLP-based molecular dynamics (MLP-MD) simulations. To address this problem, this study proposes a new on-the-fly reference data sampling method called radial distribution function (RDF)-based data sampling for MLP construction. This method detects and extracts anomalous structures from the trajectories of MLP-MD simulations by focusing on the shapes of RDFs. The detected structures are added to the reference data to improve the accuracy of the MLP. This method allows us to realize a reasonable MLP construction for liquid water with minimal additional data. We prepare data from an H2O molecular cluster system and verify whether the constructed MLPs are practical for bulk water systems. MLP-MD simulations without RDF-based data sampling show unphysical behaviors, such as atomic collisions. In contrast, after applying this method, we obtain MLP-MD trajectories with features, such as RDF shapes and angle distributions, that are comparable to those of ab initio MD simulations. Our simulation results demonstrate that the RDF-based data sampling approach is useful for constructing MLPs that are robust to extrapolations from molecular cluster systems to bulk systems without any specialized know-how.
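The summary does not spell out how an RDF shape is judged anomalous, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes a cubic periodic box, computes a single-frame RDF between two atom sets with NumPy, and flags a frame when the RDF has any weight below a chosen "forbidden" distance (the kind of signature an atomic collision would leave). The function names, the `r_forbidden` threshold, and the `tol` parameter are all hypothetical.

```python
import numpy as np

def pair_distances(pos_a, pos_b, box):
    """Minimum-image distances between two atom sets in a cubic box of edge `box`."""
    delta = pos_a[:, None, :] - pos_b[None, :, :]
    delta -= box * np.round(delta / box)  # wrap each component into [-box/2, box/2]
    return np.linalg.norm(delta, axis=-1).ravel()

def rdf(pos_a, pos_b, box, r_max, n_bins=100):
    """Single-frame RDF g(r) between distinct species A and B.

    r_max should not exceed box / 2 for the normalization to be meaningful.
    """
    dist = pair_distances(pos_a, pos_b, box)
    counts, edges = np.histogram(dist, bins=n_bins, range=(0.0, r_max))
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    # Pair count expected in each shell for an ideal gas at the same density.
    ideal = len(pos_a) * len(pos_b) / box ** 3 * shell_vol
    r = 0.5 * (edges[1:] + edges[:-1])  # bin centers
    return r, counts / ideal

def is_anomalous(r, g, r_forbidden, tol=0.0):
    """Hypothetical criterion: any RDF weight below r_forbidden marks the frame."""
    return bool(np.any(g[r < r_forbidden] > tol))

def select_frames(frames_a, frames_b, box, r_forbidden, r_max):
    """Return indices of frames whose RDF looks anomalous under the criterion above."""
    selected = []
    for i, (a, b) in enumerate(zip(frames_a, frames_b)):
        r, g = rdf(a, b, box, r_max)
        if is_anomalous(r, g, r_forbidden):
            selected.append(i)
    return selected
```

In an on-the-fly loop of the kind the summary describes, the frames returned by `select_frames` would be the candidates to recompute with the reference quantum chemical method and add to the training set before refitting the MLP. Other shape criteria, such as the deviation of the time-averaged RDF from a reference curve, could be substituted for `is_anomalous`.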
Description: Date Revised 08.11.2024
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN:1096-987X
DOI:10.1002/jcc.27497