NeurIPS 2023
Benchmark of Machine Learning Force Fields for Semiconductor Simulations: Datasets, Metrics, and Comparative Analysis
Geonu Kim, Byunggook Na, Gunhee Kim, Hyuntae Cho, Seungjin Kang, Hee Sun Lee, Saerom Choi, Heejae Kim, Seungwon Lee, Yongdeok Kim
Cited 8 times
Abstract
As semiconductor devices become miniaturized and their structures become more complex, there is a growing need for large-scale atomic-level simulations as a less costly alternative to the trial-and-error approach during development. Although machine learning force fields (MLFFs) can meet the accuracy and scale requirements for such simulations, there are no open-access benchmarks for semiconductor materials. Hence, this study presents a comprehensive benchmark suite that consists of two semiconductor material datasets and ten MLFF models with six evaluation metrics. We select two important semiconductor thin-film materials, silicon nitride and hafnium oxide, and generate their datasets using computationally expensive density functional theory (DFT) simulations under various scenarios at a cost of 2.6k GPU days. Additionally, we provide a variety of architectures as baselines: descriptor-based fully connected neural networks and graph neural networks with rotationally invariant or equivariant features. We assess not only the accuracy of energy and force predictions but also five additional simulation indicators to determine the practical applicability of MLFF models in molecular dynamics simulations. To facilitate further research, our benchmark suite is available at https://github.com/SAITPublic/MLFF-Framework.
* Equal contribution. † Co-corresponding author. 37th Conference on Neural Information Processing Systems (NeurIPS 2023), Track on Datasets and Benchmarks.
Obtaining reliable datasets and benchmarks for studying semiconductors in the condensed phase presents significant challenges, which hinder the development of accurate MLFF models. Generating precise datasets for condensed-phase materials is difficult because of the complex structures, dynamics, and large number of atoms involved.
To foster the development of MLFFs in the field, we introduce two new datasets specifically designed for semiconductor advanced materials discovery, called the SAMD23 datasets: silicon nitride (SiN) and hafnium oxide (HfO). We conducted DFT simulations under various conditions, including initial structures, stoichiometry, temperature, strain, and defects, at a total cost of 2.6k GPU days. Although we built the datasets with broad coverage, dynamic simulations explore an enormously wide range of atomic configurations with many degrees of freedom, so collecting all of these configurations through computationally expensive ab initio simulations is not feasible in practice. Hence, MLFF models must be capable of extrapolation, that is, of yielding reliable predictions for configurations that are absent from the training dataset. Extrapolation capability is generally assessed by evaluating the energy and force errors on out-of-distribution (OOD) test datasets in addition to in-distribution (ID) sets. However, the energy and force errors may not be sufficient to account for the simulation behavior [8, 9]. Thus, we additionally provide five simulation indicators and prepare corresponding material structures for both ID and OOD sets, facilitating a comprehensive comparison of the MLFF models.
Moreover, we offer a consolidated framework that streamlines model development, training, and evaluation into a unified platform. We curated diverse models that either use hand-crafted features as atomic representations or employ graph neural networks as feature extractors. For the benchmark, 10 MLFF models were trained and evaluated using the framework, and the results suggest a reliable model selection policy based on the simulation indicators. Based on a comparative analysis of models trained with various hyperparameters, we also suggest a baseline training recipe.
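The ID versus OOD comparison of energy and force errors described above can be illustrated with a minimal sketch. It assumes mean absolute error (MAE) as the error measure and per-atom normalization of the energy; both are illustrative choices rather than the benchmark's confirmed definitions, and the function name `energy_force_mae` and the toy numbers are hypothetical, not part of the MLFF-Framework API.

```python
import numpy as np


def energy_force_mae(pred_energy, ref_energy, pred_forces, ref_forces):
    """Return (per-atom energy MAE, force-component MAE) for a set of structures.

    pred_energy, ref_energy: one total energy per structure.
    pred_forces, ref_forces: one (n_atoms, 3) array per structure.
    The MAE metric and per-atom normalization are assumptions of this sketch.
    """
    pred_energy = np.asarray(pred_energy, dtype=float)
    ref_energy = np.asarray(ref_energy, dtype=float)
    # Number of atoms in each structure, taken from the reference forces.
    n_atoms = np.array([np.asarray(f).shape[0] for f in ref_forces], dtype=float)

    # Energy error is normalized by structure size so that large cells
    # do not dominate the average.
    e_mae = float(np.mean(np.abs(pred_energy - ref_energy) / n_atoms))

    # Force error is averaged over every Cartesian component of every atom.
    f_err = np.concatenate([
        np.abs(np.asarray(p, dtype=float) - np.asarray(r, dtype=float)).ravel()
        for p, r in zip(pred_forces, ref_forces)
    ])
    f_mae = float(f_err.mean())
    return e_mae, f_mae


if __name__ == "__main__":
    # Toy data standing in for an ID set and an OOD set (hypothetical values).
    rng = np.random.default_rng(0)
    ref_f = [rng.normal(size=(4, 3)) for _ in range(3)]
    ref_e = [-20.0, -21.0, -19.5]
    for name, noise in [("ID", 0.01), ("OOD", 0.05)]:
        pred_f = [f + noise for f in ref_f]
        pred_e = [e + 4 * noise for e in ref_e]
        e_mae, f_mae = energy_force_mae(pred_e, ref_e, pred_f, ref_f)
        print(f"{name}: energy MAE/atom = {e_mae:.4f}, force MAE = {f_mae:.4f}")
```

Reporting the same two numbers separately for the ID and OOD sets makes any gap between them visible. As the text notes, a small gap does not by itself guarantee stable simulation behavior, which is why the simulation indicators are evaluated in addition.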
This paper makes the following contributions:
• We introduce SAMD23, two new MLFF benchmark datasets that reflect semiconductor simulations of SiN and HfO under various scenarios.
• We provide a framework to facilitate model development, and present benchmarks for SiN and HfO, along with five simulation indicators to assess the prediction performance in simulations and the extrapolation capability.
• We suggest a baseline training recipe and model selection policy to employ the model for simulations by performing a comparative analysis of 10 MLFF models.