Electronic Structure Machine Learning with SALTED
Prepared by Zekun Lou, Alan Lewis and Mariana Rossi.
Ab initio electronic structure methods, such as density functional theory, are computationally costly, which makes them impractical for large systems. Data-driven machine learning methods offer an alternative: they predict the electronic structure at a low computational cost, bringing "ab initio accuracy" within reach for large systems.
In this context, the Symmetry-Adapted Learning of Three-Dimensional Electron Densities (SALTED) method (see 10.1021/acs.jctc.1c00576 and 10.1021/acs.jctc.2c00850) is designed to predict electron densities in periodic systems using a locally and symmetry-adapted representation of the density field. The approach employs a resolution of the identity, or density fitting (DF), method to project electron densities onto a set of atom-centred functions. Then the Symmetry-Adapted Gaussian Process Regression (SAGPR) method is applied to learn and predict density fitting coefficients and hence the electron density of molecules and materials.
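The density-fitting step described above can be illustrated with a small, self-contained toy. This is only a sketch under stated assumptions, not the SALTED or FHI-aims implementation: it works in one dimension, uses a single Gaussian per "atom", and fits the coefficients by least squares under the overlap metric, i.e. by solving S c = w with S_ij = <phi_i|phi_j> and w_i = <phi_i|rho>. The centres, width, and reference coefficients are made-up values chosen so that the result can be checked.

```python
import math

# Toy 1D "atoms": centres and Gaussian width are made-up illustration values.
centres = [-1.0, 0.0, 1.2]
width = 0.6

def phi(i, x):
    """Gaussian basis function centred on atom i."""
    return math.exp(-((x - centres[i]) ** 2) / (2.0 * width ** 2))

# Reference density built from known coefficients so the fit can be checked.
true_coeffs = [1.0, 2.0, 0.5]

def rho(x):
    return sum(c * phi(i, x) for i, c in enumerate(true_coeffs))

# Uniform grid for numerical integration (simple Riemann sum).
n_grid = 2001
x_min, x_max = -8.0, 8.0
dx = (x_max - x_min) / (n_grid - 1)
grid = [x_min + k * dx for k in range(n_grid)]

n = len(centres)
# Overlap matrix S_ij = <phi_i|phi_j> and projections w_i = <phi_i|rho>.
S = [[sum(phi(i, x) * phi(j, x) for x in grid) * dx for j in range(n)]
     for i in range(n)]
w = [sum(phi(i, x) * rho(x) for x in grid) * dx for i in range(n)]

def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for k in range(col, m + 1):
                M[r][k] -= f * M[col][k]
    c = [0.0] * m
    for r in range(m - 1, -1, -1):
        c[r] = (M[r][m] - sum(M[r][k] * c[k] for k in range(r + 1, m))) / M[r][r]
    return c

# Density-fitting coefficients: the quantities a model like SALTED learns.
coeffs = solve(S, w)
print([round(c, 6) for c in coeffs])
```

Because the toy density lies exactly in the span of the basis, the fitted coefficients reproduce the reference ones up to rounding. For a real electron density the expansion is an approximation, and the fitted coefficients become the regression targets.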
For more detail, and for instructions on using SALTED with other electronic structure software packages, please refer to the SALTED documentation.
Summary of the tutorial
In this tutorial we will use FHI-aims to generate an ab initio dataset of electron densities of water monomers. We will then apply the SALTED machine-learning model (implemented in Python) to learn these densities and predict the electron density of water dimers.
The Installation section outlines the setup process for both FHI-aims and SALTED.
The Theory section introduces some theory behind the machine learning techniques and numerical methods employed in SALTED.
The Workflow section outlines the general workflow for using SALTED.
Part 1 focuses on how to generate data for SALTED with FHI-aims.
Part 2 shows how to generate descriptors, how to perform regression with SALTED and then validate the model.
Part 3 shows how to use a trained SALTED model to predict electron densities of new structures, and how to derive physical properties from the predicted densities with FHI-aims.
The Appendix details the input file for SALTED.
The References section lists all literature cited in this tutorial.
Note: Aliases and assumptions in this tutorial.
Variables enclosed in [ ] indicate values that need evaluation. E.g. [inp.gpr.Menv] denotes an input parameter in the input file inp.yaml.
Variables starting with $ represent shell variables. E.g. $ntasks means the number of tasks in an MPI job.
The example HPC batch scripts are written for the SLURM scheduler. If you use a different HPC scheduler or a personal computer, please adapt the scripts and commands accordingly.
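To make the bracket notation concrete, the following minimal sketch shows how an alias such as [inp.gpr.Menv] maps onto a nested entry of the input file. It assumes the contents of inp.yaml have already been loaded into a nested dictionary. The helper name evaluate and the value 100 are hypothetical placeholders introduced for illustration only; they are not part of SALTED and not a recommended setting.

```python
# Hypothetical contents of inp.yaml, written as a nested dict for illustration.
# The value 100 is a placeholder, not a recommended setting.
inp = {"gpr": {"Menv": 100}}

def evaluate(alias, namespaces):
    """Resolve an alias such as '[inp.gpr.Menv]' to the value it refers to."""
    # Drop the surrounding brackets and split the dotted path into keys.
    keys = alias.strip("[] ").split(".")
    value = namespaces[keys[0]]
    for key in keys[1:]:
        value = value[key]
    return value

menv = evaluate("[inp.gpr.Menv]", {"inp": inp})
print(menv)
```

Wherever the tutorial writes [inp.gpr.Menv], the reader substitutes the value found under gpr, then Menv, in their own inp.yaml.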
Location of the tutorial material
All files related to this tutorial can be found in the SALTED GitHub repository in the folder example/water_monomer_AIMS.