Predict new structure and collect outputs
Overview before starting
We are going to
- Predict the electron density of water dimers with SALTED (having trained only on monomers).
- Calculate derived properties from the predictions with FHI-aims.
- Compare the results with the reference values from FHI-aims.
Related starting files
file or dir name | description |
---|---|
README.rst | README file for your reference |
inp.yaml | SALTED input file, consists of file paths and hyperparameters |
control_read_setup.in | FHI-aims control file, for preparing basis info for reordering data |
control_read.in | FHI-aims control file, for predicting properties by SALTED outputs |
run-aims-predict.sbatch | Example sbatch script, for predicting properties by SALTED outputs |
water_dimers_10.xyz | xyz file, predicting dataset |
Predict densities of new structures
We will use water_dimers_10.xyz as the prediction dataset.
Before running the prediction, check the inp.yaml file and make sure the inp.prediction.filename entry is water_dimers_10.xyz.
If you want to predict other structures, prepare your own xyz file and set inp.prediction.filename to its file name.
Do not skip this check, or the prediction will run on the wrong structures.
To conduct the prediction, run
mpirun -np $ntasks python -m salted.prediction
The predicted DF coefficients are stored at predictions_[inp.salted.saltedname]/M[inp.gpr.Menv]_zeta[inp.gpr.z]/N[ntrain]_reg[inp.gpr.regul]/COEFFS-[n].dat, where [n] labels the \(n\)-th configuration in water_dimers_10.xyz. Note that the COEFFS files are 1-indexed, unlike the numpy files produced during Part 2, which are 0-indexed. ntrain is given by inp.gpr.Ntrain * inp.gpr.trainfrac.
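As an illustration of the naming scheme above, the following sketch assembles the path of a COEFFS file from the input parameters. The function name coeffs_path and all example values are hypothetical. It also assumes that the product Ntrain * trainfrac is truncated to an integer and that zeta and the regularization appear in the path exactly as written in inp.yaml, so check the result against the directory that SALTED actually creates.

```python
from pathlib import Path


def coeffs_path(saltedname, menv, zeta, ntrain_total, trainfrac, regul, index0):
    """Return the COEFFS path for the structure at 0-based position index0.

    zeta and regul are passed as strings so that no number formatting
    is assumed (hypothetical helper, not part of SALTED).
    """
    # Assumption: the effective training-set size is truncated to an integer.
    ntrain = int(ntrain_total * trainfrac)
    # COEFFS files are 1-indexed, so shift the 0-based position by one.
    n = index0 + 1
    return (
        Path(f"predictions_{saltedname}")
        / f"M{menv}_zeta{zeta}"
        / f"N{ntrain}_reg{regul}"
        / f"COEFFS-{n}.dat"
    )


# Hypothetical parameter values, for illustration only.
print(coeffs_path("demo", 30, "2.0", 100, 0.5, "1e-10", 0).as_posix())
```

Passing the 0-based position and converting inside the helper keeps the off-by-one shift in a single place when looping over structures alongside the 0-indexed numpy files.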
Calculate derived properties of new structures
The example sbatch script run-aims-predict.sbatch is provided for reference and will execute the commands described in this section; you should adapt it to your own system.
Predict electron density by SALTED
It is recommended to use AIMS version >= 240403 when performing predictions based on SALTED coefficients, as this significantly simplifies the workflow.
To set up the AIMS calculations, first run salted.aims.make_geoms --predict; the --predict flag creates an AIMS geometry file for each structure in the prediction set in the folder inp.predict_data/geoms. Then create a directory for each prediction calculation, and copy control_read.in to control.in in each directory, along with the corresponding geometry.in file. Finally, the coefficients need to be added to each working directory for the AIMS calculations. This is achieved by running salted.aims.move_data_in, which produces a file ri_restart_coeffs_predicted.out in each directory. This file must be renamed to ri_restart_coeffs.out before running AIMS.
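The directory setup described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the geometry files are assumed to sit in one folder with an .in suffix, and each working directory is named after its geometry file. Both conventions are hypothetical, as is the function name, so adjust them to the layout that salted.aims.make_geoms and salted.aims.move_data_in actually use on your system.

```python
import shutil
from pathlib import Path


def setup_prediction_dirs(geoms_dir, control_file, work_root):
    """Create one working directory per geometry file (hypothetical layout).

    Each directory receives a copy of the control file named control.in
    and a copy of the geometry file named geometry.in.
    """
    geoms_dir, work_root = Path(geoms_dir), Path(work_root)
    created = []
    for geom in sorted(geoms_dir.glob("*.in")):
        # Assumption: the working directory is named after the geometry file.
        workdir = work_root / geom.stem
        workdir.mkdir(parents=True, exist_ok=True)
        shutil.copy(control_file, workdir / "control.in")
        shutil.copy(geom, workdir / "geometry.in")
        created.append(workdir)
    return created
```

A call such as setup_prediction_dirs("geoms", "control_read.in", "runs") would then leave one ready-to-fill directory per structure under the hypothetical runs folder.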
Instructions when using AIMS version < 240403
Before reading the predicted coefficients into FHI-aims, we need to reorder the DF coefficients from SALTED to conform with the spherical-harmonic convention in FHI-aims. To get the necessary files, we first do a prerun of FHI-aims with control_read_setup.in and geometry.in in the working dir. salted.aims.move_data_in_reorder then performs the reordering based on idx_prodbas.out and prodbas_condon_shotley_list.out from the FHI-aims prerun, and writes its output to ri_restart_coeffs_predicted.out.
The example sbatch script run-aims-predict-reorder.sbatch is provided only for reference, and you should adapt it to your own system. Have a look at the file control_read_setup.in to understand the RI-related flags needed in this case.
sbatch run-aims-predict.sbatch
For each FHI-aims calculation, the script should
- Copy control_read_setup.in (as control.in) and geometry.in to the working dir.
- Run the FHI-aims prerun across all structures.
  - During the prerun, FHI-aims will just generate idx_prodbas.out and prodbas_condon_shotley_list.out for reordering and rephasing DF coefficients.
  - You might see FHI-aims outputs ending like this:
    An error led to a call to aims_stop_coll, but without a specific message. A detailed message may be in another file
    This can be ignored, as the code just stops after generating the necessary lists, without doing anything else.
- Run python -m salted.aims.move_data_in_reorder.
- Rename ri_restart_coeffs_predicted.out to ri_restart_coeffs.out.
- Run an FHI-aims prediction calculation across all structures.
  - There will only be one SCF iteration (one diagonalization).
- Rename rho_rebuilt_ri.out to rho_ml.out, and rename ri_restart_coeffs.out to ri_restart_coeffs_ml.out.
  - We have explained the reason in Part 1.
The rho_ml.out and ri_restart_coeffs_ml.out files contain the predicted densities and DF coefficients.
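The final renaming step can be sketched in Python as below. The file names are the ones given in the text; the function name and the idea of calling it once per working directory are assumptions made for illustration, and the example sbatch script may well do the same thing with plain mv commands.

```python
from pathlib import Path

# Old name -> new name, as described in the text.
RENAMES = {
    "rho_rebuilt_ri.out": "rho_ml.out",
    "ri_restart_coeffs.out": "ri_restart_coeffs_ml.out",
}


def rename_prediction_outputs(workdir):
    """Rename the FHI-aims prediction outputs in one working directory.

    Returns the list of new file names that were actually created, so that
    missing outputs (e.g. from a failed run) are easy to spot.
    """
    workdir = Path(workdir)
    renamed = []
    for old, new in RENAMES.items():
        source = workdir / old
        if source.exists():
            source.rename(workdir / new)
            renamed.append(new)
    return renamed
```

Returning the renamed names rather than raising on a missing file is a design choice for batch use: a loop over all working directories can continue and report incomplete calculations at the end.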
Validate the predicted densities
To further check the prediction results, we can compare the predicted densities with the reference values from FHI-aims. We will reuse the script run-aims.sbatch to calculate these references, but remember to change the bash variable $DATADIR to [inp.qm.path2qm]+[inp.prediction.predict_data]. You can also comment out the line ri_full_output .True. in control.in to avoid outputting the overlap matrix for the predicted structures.
sbatch run-aims.sbatch
The rho_scf.out output files contain the reference ab initio densities for each structure.
Then run
python -m salted.aims.get_ml_err
This compares rho_ml.out with rho_scf.out.
The real-space integral of the absolute error is stored in ml_maes for each structure, and the total mean absolute error is printed to the terminal.
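To make the error measure concrete, the sketch below computes a real-space integral of the absolute density error as a weighted sum over grid points, and then averages it over structures. It is not the implementation in salted.aims.get_ml_err: the function names, the representation of densities and integration weights as plain lists, and the absence of any normalization are all assumptions made for illustration.

```python
def integrated_abs_error(rho_ml, rho_ref, weights):
    """Approximate the integral of |rho_ml - rho_ref| on a quadrature grid.

    All three arguments are equal-length sequences of per-grid-point values
    (hypothetical data layout, not the rho_*.out file format).
    """
    if not (len(rho_ml) == len(rho_ref) == len(weights)):
        raise ValueError("densities and weights must have the same length")
    return sum(w * abs(p - r) for p, r, w in zip(rho_ml, rho_ref, weights))


def mean_error(per_structure_errors):
    """Average the per-structure integrated errors."""
    return sum(per_structure_errors) / len(per_structure_errors)


# Toy numbers, for illustration only.
err = integrated_abs_error([1.0, 2.0], [1.5, 2.0], [0.5, 0.5])
print(err)  # 0.25
```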
Get physical properties
To get physical properties (like electrostatic energy, XC energy and total energies per atom) from AIMS output, run
python -m salted.aims.collect_energies
The properties are written to files predict_reference_*, with the predicted properties in the first column and the reference values in the second column.
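A quick way to summarize one of these files is the mean absolute deviation between its two columns. The sketch below assumes whitespace-separated columns with one row per structure and skips blank lines and lines starting with #; the exact file format is an assumption, and the function name is hypothetical.

```python
def column_mae(lines):
    """Mean absolute error between predicted (col 1) and reference (col 2).

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    diffs = []
    for line in lines:
        line = line.strip()
        # Assumption: blank lines and '#' comments carry no data.
        if not line or line.startswith("#"):
            continue
        predicted, reference = (float(x) for x in line.split()[:2])
        diffs.append(abs(predicted - reference))
    if not diffs:
        raise ValueError("no data rows found")
    return sum(diffs) / len(diffs)


# Toy rows standing in for the contents of a predict_reference_* file.
print(column_mae(["1.0 1.5", "2.0 2.0"]))  # 0.25
```

For a real file, the same function can be applied to an open handle, for example inside a with open(...) block over whichever predict_reference_* file you want to inspect.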