NEPMaker: Active learning of neuroevolution machine learning potential for large cells
Junjie Wang, Shuning Pan, Haoting Zhang, Qiuhan Jia, Chi Ding, Zheyong Fan, Jian Sun

TL;DR
NEPMaker introduces an active learning framework for neuroevolution potentials that enhances large-scale materials simulations by reducing extrapolation errors and improving model robustness.
Contribution
It develops a D-optimality-driven active learning method for NEP, integrated into GPUMD, that identifies extrapolative atomic environments on the fly during large-scale simulations.
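As a rough illustration of what a D-optimality criterion measures, the sketch below computes a MaxVol-style extrapolation grade with NumPy. It is not NEPMaker's or GPUMD's implementation: the function name `extrapolation_grades`, the toy descriptor vectors, and the threshold of 1.5 are all assumptions made for this example. The idea is that if `c = b A^{-1}` for a candidate descriptor `b` and a square active-set matrix `A`, then swapping row `j` of `A` for `b` scales `|det A|` by `|c_j|`, so a grade above 1 means the candidate lies outside the region spanned by the active set.

```python
import numpy as np

def extrapolation_grades(active_set, descriptors):
    """Return gamma_i = max_j |(b_i A^{-1})_j| for each descriptor row b_i.

    active_set  : (m, m) square matrix A of selected descriptor vectors.
    descriptors : (n, m) matrix whose rows are candidate descriptors b_i.
    """
    # Solve X A = B for X by solving A^T X^T = B^T (avoids forming A^{-1}).
    coeffs = np.linalg.solve(active_set.T, descriptors.T).T
    return np.abs(coeffs).max(axis=1)

# Toy data (hypothetical): a 2-dimensional descriptor space.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
B = np.array([[0.5, 0.5],   # inside the span of the active set
              [2.0, 0.0],   # clearly extrapolative
              [1.0, 0.0]])  # identical to an active-set row

gamma = extrapolation_grades(A, B)

# Hypothetical threshold: environments above it would be sent for labeling.
flagged = np.where(gamma > 1.5)[0]
```

In this toy case only the second candidate is flagged. In practice the descriptors would come from the potential itself, and the threshold is a tunable choice rather than a fixed constant.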
Findings
Reduces extrapolation errors in large-scale simulations.
Improves model robustness and transferability.
Enables construction of reliable ML potentials for complex materials.
Abstract
Machine learning potentials (MLPs) achieve near first-principles accuracy but often fail for atomic environments outside the training distribution. Active learning can mitigate this limitation; however, its application to large-scale simulations is hindered by the prohibitive cost of labeling entire configurations. Here, we develop a D-optimality-driven active learning framework for the neuroevolution potential (NEP) implemented within the GPUMD package, named NEPMaker. Extrapolative atomic environments are identified on-the-fly and embedded into locally periodic structures, where boundary atoms are optimized to remain close to the training distribution. This strategy enables large-scale simulations to directly contribute to dataset construction, significantly reducing extrapolation errors while improving model robustness and transferability. The proposed framework provides a scalable…
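The step of embedding an extrapolative environment into a small periodic structure can be pictured with the minimal sketch below. It only covers the geometric part, under simplifying assumptions that are not taken from the paper: an orthorhombic simulation box, a cubic cut-out region, and a hypothetical helper named `carve_local_cell`. The optimization of boundary atoms described in the abstract is deliberately left out.

```python
import numpy as np

def carve_local_cell(positions, box, center_index, half_width):
    """Cut a cubic region around one atom out of a large periodic cell.

    positions    : (n, 3) Cartesian coordinates in an orthorhombic box.
    box          : (3,) box edge lengths.
    center_index : index of the atom whose environment was flagged.
    half_width   : half the edge length of the cubic region to extract.

    Returns the shifted positions, the edge lengths of the new small box,
    and the indices of the selected atoms in the original structure.
    """
    positions = np.asarray(positions, dtype=float)
    box = np.asarray(box, dtype=float)
    # Displacements from the central atom under the minimum-image convention.
    delta = positions - positions[center_index]
    delta -= box * np.round(delta / box)
    # Keep atoms strictly inside the cube centred on the flagged atom.
    mask = np.all(np.abs(delta) < half_width, axis=1)
    # Shift so the central atom sits in the middle of the new box.
    local_positions = delta[mask] + half_width
    local_box = np.full(3, 2.0 * half_width)
    return local_positions, local_box, np.where(mask)[0]

# Toy structure (hypothetical): 10 x 10 x 10 simple cubic lattice, spacing 1.
grid = np.arange(10)
positions = np.stack(np.meshgrid(grid, grid, grid, indexing="ij"), axis=-1)
positions = positions.reshape(-1, 3).astype(float)

local_positions, local_box, selected = carve_local_cell(
    positions, box=[10.0, 10.0, 10.0], center_index=0, half_width=1.5
)
```

For this commensurate lattice the cut-out cell happens to be perfectly periodic. For a general structure the atoms near the faces of the small box would see artificial neighbors, which is the situation the boundary-atom optimization in the abstract is meant to address.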
