# Electronic Structure Machine Learning with SALTED

**Prepared by Zekun Lou, Alan Lewis and Mariana Rossi.**

Ab initio electronic structure methods, such as density functional theory, are computationally costly, which makes them impractical for large systems. Data-driven machine learning methods offer an alternative: they predict the electronic structure at a low computational cost, bringing "ab initio accuracy" within reach for large systems.

In this context, the **Symmetry-Adapted Learning of Three-Dimensional Electron Densities** (SALTED) method
(see 10.1021/acs.jctc.1c00576 and 10.1021/acs.jctc.2c00850)
is designed to predict electron densities in periodic systems using a locally and symmetry-adapted representation of the density field.
The approach employs a resolution of the identity, or density fitting (DF), method to project electron densities onto a new basis set formed by products of numeric atomic orbitals (NAO).
The Symmetry-Adapted Gaussian Process Regression (SAGPR) method is then applied to learn the density fitting coefficients from a dataset of smaller systems, and the resulting model extrapolates well to larger systems.
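To make the density fitting step concrete, here is a minimal one-dimensional sketch, not the SALTED or FHI-aims implementation. It assumes a toy Gaussian basis (the centers, widths and coefficients are made up for illustration) in place of NAO products, and it assumes a plain overlap metric: the coefficients are obtained by solving `S c = w`, where `S` is the basis overlap matrix and `w` holds the projections of the density onto the basis functions.

```python
import numpy as np

# 1D grid standing in for real space (illustrative only).
x = np.linspace(-6.0, 6.0, 2001)
dx = x[1] - x[0]

# Hypothetical auxiliary basis: three Gaussians with assumed centers and widths.
centers = np.array([-1.0, 0.0, 1.0])
widths = np.array([0.8, 1.2, 0.8])
basis = np.exp(-((x[None, :] - centers[:, None]) ** 2) / (2.0 * widths[:, None] ** 2))

# Toy "density" built from known coefficients, so the fit can be checked.
true_coeffs = np.array([0.5, 1.0, 0.3])
density = true_coeffs @ basis

# Overlap matrix S_ij = <phi_i|phi_j> and projections w_i = <phi_i|rho>,
# both evaluated with the same simple Riemann sum on the grid.
S = (basis @ basis.T) * dx
w = (basis @ density) * dx

# Density fitting coefficients: solve S c = w.
coeffs = np.linalg.solve(S, w)

# Reconstruct the density from the fitted coefficients and measure the error.
fitted = coeffs @ basis
max_error = np.max(np.abs(fitted - density))
```

Because the toy density lies exactly in the span of the basis, `coeffs` recovers `true_coeffs` up to numerical precision. A real density is only approximated by the expansion, and in that setting the coefficients are the quantities a regression model would be trained to predict.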

## Summary of the tutorial

Note: Aliases and assumptions in this tutorial.

Variables enclosed in `[ ]` indicate values that need evaluation.
E.g. `[inp.gpr.Menv]` denotes an input parameter in the input file `inp.yaml`.

Variables starting with `$` represent shell variables.
E.g. `$ntasks` means the number of tasks in an MPI job.

The example HPC batch scripts are written for the SLURM scheduler. If you use a different HPC scheduler or a personal computer, please adapt the scripts and commands accordingly.

In this tutorial we will use FHI-aims to generate an *ab initio* dataset of electron densities of water monomers.
Subsequently, we will apply the SALTED machine-learning model (implemented in Python)
to learn these densities and predict the electron density of water dimers.

Section **Preparations** introduces related papers and the setup process for both FHI-aims and SALTED.

**Part 1** introduces some theory behind the machine learning techniques and numerical methods.

**Part 2** introduces the workflow of SALTED and focuses on how to generate data for SALTED with FHI-aims.
The resulting dataset is used in **Part 3**.

**Part 3** shows how to generate descriptors, perform regression with SALTED, and validate the model.

**Part 4** shows how to use a trained SALTED model to predict electron densities of new structures,
and how to derive physical properties from the predicted densities with FHI-aims.

**Appendix** details the input file for SALTED.

**References** lists all literature mentioned in this tutorial.

## Location of the tutorial material

All files related to this tutorial can be found here.