Saving SCF results on disk and SCF checkpoints

For longer DFT calculations it is standard practice to run them on a cluster in advance and to perform postprocessing (band structure calculation, plotting of density, etc.) at a later point, potentially on a different machine.

To support such workflows DFTK offers the two functions save_scfres and load_scfres, which save the data structure returned by self_consistent_field to disk and load it back into memory, respectively. For this purpose DFTK uses the JLD2.jl file format and Julia package. For the moment this process is considered an experimental feature and has a number of caveats; see the warnings below.

Saving `scfres` is experimental

The load_scfres and save_scfres functions are experimental features. This means:

  • The interface of these functions as well as the format in which the data is stored on disk can change incompatibly in the future. At this point we make no promises ...
  • JLD2 is not yet fully mature, so it is recommended to use it only for short-term storage and not to archive scientific results.
  • If you are using the functions to transfer data between different machines, ensure that you use the same versions of Julia, JLD2 and DFTK for saving and loading data.
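
One way to make the last point checkable is to store the versions in a small text file next to the data. The following is a minimal sketch in plain Julia, not a DFTK or JLD2 feature: the helper names `environment_versions` and `write_versions` and the `.versions.toml` suffix are hypothetical, and `pkgversion` requires Julia 1.9 or newer.

```julia
using TOML

# Hypothetical helper (not part of DFTK or JLD2): collect the Julia version
# and the versions of the given modules into a plain dictionary.
function environment_versions(modules...)
    versions = Dict{String,String}("julia" => string(VERSION))
    for m in modules
        v = pkgversion(m)  # Julia >= 1.9; returns `nothing` if no version is known
        versions[string(nameof(m))] = v === nothing ? "unknown" : string(v)
    end
    versions
end

# Hypothetical helper: write the record next to the data file,
# e.g. "scfres.jld2" gets the sidecar "scfres.jld2.versions.toml".
function write_versions(datafile, modules...)
    sidecar = datafile * ".versions.toml"
    open(sidecar, "w") do io
        TOML.print(io, environment_versions(modules...))
    end
    sidecar
end
```

After saving, one would call `write_versions("scfres.jld2", DFTK, JLD2)` and transfer both files. On the loading machine, comparing `TOML.parsefile("scfres.jld2.versions.toml")` with `environment_versions(DFTK, JLD2)` shows any mismatch before the data is loaded.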

To illustrate the use of the functions in practice we will compute the total energy of the O₂ molecule at the PBE level. To get the triplet ground state we use a collinear spin polarisation (see Collinear spin and magnetic systems for details) and a small electronic temperature to ease convergence:

using DFTK
using LinearAlgebra
using JLD2

d = 2.079  # oxygen-oxygen bond length
a = 9.0    # size of the simulation box
lattice = diagm(a * ones(3))
O = ElementPsp(:O, psp=load_psp("hgh/pbe/O-q6.hgh"))
atoms = [O => d / 2a * [[0, 0, 1], [0, 0, -1]]]
magnetic_moments = [O => [1., 1.]]

Ecut  = 10  # Far too small to be converged
model = model_PBE(lattice, atoms, temperature=0.02, smearing=Smearing.Gaussian(),
                  magnetic_moments=magnetic_moments)
basis = PlaneWaveBasis(model, Ecut; kgrid=[1, 1, 1])

ρspin  = guess_spin_density(basis, magnetic_moments)
scfres = self_consistent_field(basis, tol=1e-2, ρspin=ρspin)
save_scfres("scfres.jld2", scfres);
n     Free energy       Eₙ-Eₙ₋₁     ρout-ρin   Magnet   Diag
---   ---------------   ---------   --------   ------   ----
  1   -27.63624854545         NaN   9.78e-01    0.001    5.0
  2   -28.49921786019   -8.63e-01   6.72e-01    0.901    6.0
  3   -28.90881088591   -4.10e-01   1.52e-01    1.273    4.0
  4   -28.93719495779   -2.84e-02   3.93e-02    1.732    3.0
  5   -28.93852280959   -1.33e-03   2.85e-02    1.931    2.0
scfres.energies
Energy breakdown:
    Kinetic             16.9097082
    AtomicLocal         -58.8025553
    AtomicNonlocal      4.7450553 
    Ewald               -4.8994689
    PspCorrection       0.0044178 
    Hartree             19.5263161
    Xc                  -6.4195592
    Entropy             -0.0024368

    total               -28.938522809592

The scfres.jld2 file could now be transferred to a different computer, where one could fire up a REPL to inspect the results of the above calculation:

using DFTK
using JLD2
loaded = load_scfres("scfres.jld2")
propertynames(loaded)
(:ham, :basis, :energies, :converged, :ρ, :ρspin, :eigenvalues, :occupation, :εF, :n_iter, :n_ep_extra, :ψ, :diagonalization, :stage)
loaded.energies
Energy breakdown:
    Kinetic             16.9097082
    AtomicLocal         -58.8025553
    AtomicNonlocal      4.7450553 
    Ewald               -4.8994689
    PspCorrection       0.0044178 
    Hartree             19.5263161
    Xc                  -6.4195592
    Entropy             -0.0024368

    total               -28.938522809592

Since the loaded data contains exactly the same data as the scfres returned by the SCF calculation, one could use it to plot a band structure directly from the stored data, e.g. plot_bandstructure(load_scfres("scfres.jld2")).

Checkpointing of SCF calculations

A related feature, which is especially useful for longer calculations with DFTK, is automatic checkpointing, where the state of the SCF is periodically written to disk. The advantage is that if the calculation errors out or is aborted for exceeding the walltime limit, one does not need to start from scratch but can continue the calculation from the last checkpoint.

To enable automatic checkpointing in DFTK one needs to pass the ScfSaveCheckpoints callback to self_consistent_field, for example:

callback = DFTK.ScfSaveCheckpoints()
scfres = self_consistent_field(basis, tol=1e-2, ρspin=ρspin, callback=callback);

Notice that using this callback makes the SCF go silent, since the passed callback parameter overrides the default value (namely ScfDefaultCallback()), which is what produces the familiar printing of the SCF convergence. If you want both printing and checkpointing, you need to chain both callbacks:

callback = DFTK.ScfDefaultCallback() ∘ DFTK.ScfSaveCheckpoints(keep=true)
scfres = self_consistent_field(basis, tol=1e-2, ρspin=ρspin, callback=callback);
n     Free energy       Eₙ-Eₙ₋₁     ρout-ρin   Magnet   Diag
---   ---------------   ---------   --------   ------   ----
  1   -27.63705536448         NaN   9.78e-01    0.001    5.0
  2   -28.49949648309   -8.62e-01   6.72e-01    0.904    6.0
  3   -28.90885843370   -4.09e-01   1.52e-01    1.276    4.0
  4   -28.93721441996   -2.84e-02   3.93e-02    1.734    3.0
  5   -28.93850596006   -1.29e-03   2.86e-02    1.931    2.0

For more details on using callbacks with DFTK's self_consistent_field function see Monitoring self-consistent field calculations.
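
The chaining with ∘ used above is ordinary Julia function composition: `(f ∘ g)(x)` evaluates `f(g(x))`, so the right-hand callback runs first and has to return the object it received for the left-hand one to see it. The sketch below illustrates this in plain Julia with mock data. The callbacks `print_energy` and `record_history!` and the fields `n_iter` and `energy` are hypothetical stand-ins, and no claim is made about the exact fields DFTK passes to its callbacks.

```julia
# Hypothetical stand-in for a printing callback: takes the iteration state
# `info` and returns it unchanged, so it can be chained with ∘.
function print_energy(info)
    println("iteration ", info.n_iter, ": energy = ", info.energy)
    info
end

# Hypothetical stand-in for a callback with side effects: builds a closure
# that appends each energy to `history` and passes `info` along.
function record_history!(history)
    function callback(info)
        push!(history, info.energy)
        info
    end
end

history  = Float64[]
callback = print_energy ∘ record_history!(history)  # record first, then print

# Mock iteration states; in a real run these would come from the SCF solver.
for (n, E) in enumerate([-27.6, -28.5, -28.9])
    callback((; n_iter=n, energy=E))
end
```

If a callback in the middle of such a chain returned `nothing` instead of `info`, the next callback would fail when accessing its fields. Returning the argument is what makes the chain work.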

By default the checkpoint is saved in the file dftk_scf_checkpoint.jld2, which is deleted automatically once the SCF completes successfully. To keep the file one needs to specify keep=true, as has been done in the last SCF above for demonstration purposes. We can therefore continue the previous calculation from the last checkpoint as if the SCF had been aborted. For this one just loads the checkpoint with load_scfres:

oldstate = load_scfres("dftk_scf_checkpoint.jld2")
scfres   = self_consistent_field(oldstate.basis, ρ=oldstate.ρ, ρspin=oldstate.ρspin,
                                 ψ=oldstate.ψ, tol=1e-3);
n     Free energy       Eₙ-Eₙ₋₁     ρout-ρin   Magnet   Diag
---   ---------------   ---------   --------   ------   ----
  1   -28.93886630430         NaN   2.64e-02    1.985    1.0
  2   -28.93940387331   -5.38e-04   1.39e-02    1.982    2.0
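
In a job script one often wants the same code path for a fresh start and for a restart. The following is a minimal sketch of such a pattern, using only the calls shown on this page. The helper `run_or_resume` and the backup file name are hypothetical. The sketch relies on the default checkpoint being deleted after a successful SCF, so that an existing file indicates an interrupted run. Moving the checkpoint aside before resuming is a precaution, in case the checkpoint callback does not write over an existing file, which is not stated here.

```julia
using DFTK
using JLD2

const CHECKPOINT = "dftk_scf_checkpoint.jld2"      # default name, see above
const BACKUP     = "dftk_scf_checkpoint.old.jld2"  # hypothetical backup name

# Hypothetical helper (not part of DFTK): resume from a leftover checkpoint
# if there is one, otherwise start a fresh SCF from the given spin density.
function run_or_resume(basis; ρspin, tol=1e-3)
    # Print convergence and write checkpoints; without keep=true the
    # checkpoint is removed again once the SCF completes successfully.
    callback = DFTK.ScfDefaultCallback() ∘ DFTK.ScfSaveCheckpoints()
    if isfile(CHECKPOINT)
        # A leftover checkpoint means the previous run did not finish.
        mv(CHECKPOINT, BACKUP; force=true)
        old = load_scfres(BACKUP)
        self_consistent_field(old.basis, ρ=old.ρ, ρspin=old.ρspin, ψ=old.ψ,
                              tol=tol, callback=callback)
    else
        self_consistent_field(basis, tol=tol, ρspin=ρspin, callback=callback)
    end
end
```

With the `basis` and `ρspin` defined earlier, this would be called as `scfres = run_or_resume(basis; ρspin=ρspin, tol=1e-3)`. The backup file is left in place and can be removed once the resumed run has finished.
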

Availability of `load_scfres`, `save_scfres` and `ScfSaveCheckpoints`

As JLD2 is an optional dependency of DFTK these three functions are only available once one has both imported DFTK and JLD2 (using DFTK and using JLD2).

(Clean up files generated by this notebook)

rm("dftk_scf_checkpoint.jld2")
rm("scfres.jld2")