Molly documentation

This documentation will first introduce the main features of the package with some examples, then will give details on each component of a simulation. There are further examples in the Molly examples section. For more information on specific types or functions, see the Molly API section or call ?function_name in Julia. The Differentiable simulation with Molly section describes taking gradients through simulations.

Molly takes a modular approach to molecular simulation. To run a simulation you create a Simulation object and call simulate! on it. The different components of the simulation can be used as defined by the package, or you can define your own versions. An important principle of the package is that your custom components, particularly force functions, should be easy to define and just as performant as the built-in versions.

Simulating a gas

Let's look at the simulation of a gas acting under the Lennard-Jones potential to start with. First, we'll need some atoms with the relevant parameters defined.

using Molly

n_atoms = 100
mass = 10.0u"u"
atoms = [Atom(mass=mass, σ=0.3u"nm", ϵ=0.2u"kJ * mol^-1") for i in 1:n_atoms]

See the Unitful.jl docs for more information on the unit annotations. Molly re-exports Unitful.jl and StaticArrays.jl since they are usually required to run simulations. Next, we'll need some starting coordinates and velocities.

box_size = 2.0u"nm"
coords = placeatoms(n_atoms, box_size, 0.3u"nm") # Random placement without clashing

temp = 100u"K"
velocities = [velocity(mass, temp) for i in 1:n_atoms]

We store the coordinates and velocities as static arrays for performance. They can be of any number of dimensions and of any number type, e.g. Float64 or Float32. Now we can define our general interactions, i.e. those between most or all atoms. Because we have defined the relevant parameters for the atoms, we can use the built-in Lennard Jones type.

general_inters = (LennardJones(),)

Finally, we can define and run the simulation. We use an Andersen thermostat to keep a constant temperature, and we log the temperature and coordinates every 10 steps. Periodic boundary conditions are used in a cubic box.

s = Simulation(
    simulator=VelocityVerlet(), # Use velocity Verlet integration
    atoms=atoms,
    general_inters=general_inters,
    coords=coords,
    velocities=velocities,
    temperature=temp,
    box_size=box_size,
    thermostat=AndersenThermostat(1.0u"ps"),
    loggers=Dict("temp" => TemperatureLogger(10),
                    "coords" => CoordinateLogger(10)),
    timestep=0.002u"ps",
    n_steps=1_000,
)

simulate!(s)

By default the simulation is run in parallel on the number of threads available to Julia, but this can be turned off by giving the keyword argument parallel=false to simulate!. An animation of the stored coordinates using can be saved using visualize, which is available when Makie.jl is imported.

using Makie
visualize(s.loggers["coords"], box_size, "sim_lj.gif")

LJ simulation

GPU acceleration

To run simulations on the GPU you will need to have a CUDA-compatible device and to have CUDA.jl installed. Simulation setup is similar to above, but with the coordinates, velocities and atoms moved to the GPU. Currently, running on the GPU requires using the AtomMin type, which is isbits, or another isbits type for the atoms. Neighbor lists and thermostats are not currently implemented for the GPU. This example also shows setting up a simulation to run with Float32, which is a good idea for GPUs.

using Molly
using CUDA

n_atoms = 100
mass = 10.0f0u"u"
box_size = 2.0f0u"nm"
temp = 100.0f0u"K"
atoms = cu([AtomMin(mass=mass, σ=0.3f0u"nm", ϵ=0.2f0u"kJ * mol^-1") for i in 1:n_atoms])
coords = cu(placeatoms(n_atoms, box_size, 0.3u"nm"))
velocities = cu([velocity(mass, temp) for i in 1:n_atoms])

s = Simulation(
    simulator=VelocityVerlet(),
    atoms=atoms,
    general_inters=(LennardJones(),),
    coords=coords,
    velocities=velocities,
    temperature=temp,
    box_size=box_size,
    thermostat=NoThermostat(),
    loggers=Dict("temp" => TemperatureLogger(typeof(1.0f0u"K"), 10),
                    "coords" => CoordinateLogger(typeof(box_size), 10)),
    timestep=0.002f0u"ps",
    n_steps=1_000,
)

simulate!(s)

Simulating diatomic molecules

If we want to define specific interactions between atoms, for example bonds, we can do. Using the same definitions as the first example, let's set up the coordinates so that paired atoms are 1 Å apart.

coords = placeatoms(n_atoms ÷ 2, box_size, 0.3u"nm")
for i in 1:length(coords)
    push!(coords, coords[i] .+ [0.1, 0.0, 0.0]u"nm")
end

velocities = [velocity(mass, temp) for i in 1:n_atoms]

Now we can use the built-in bond type to place a harmonic constraint between paired atoms. The arguments are the indices of the two atoms in the bond, the equilibrium distance and the force constant.

bonds = [HarmonicBond(i=i, j=(i + n_atoms ÷ 2), b0=0.1u"nm", kb=300_000.0u"kJ * mol^-1 * nm^-2") for i in 1:(n_atoms ÷ 2)]

specific_inter_lists = (bonds,)

This time, we are also going to use a neighbor list to speed up the Lennard Jones calculation. We can use the built-in distance neighbor finder. The arguments are a 2D array of eligible interactions, the number of steps between each update and the cutoff to be classed as a neighbor.

# All pairs apart from bonded pairs are eligible to interact
nb_matrix = trues(n_atoms, n_atoms)
for i in 1:(n_atoms ÷ 2)
    nb_matrix[i, i + (n_atoms ÷ 2)] = false
    nb_matrix[i + (n_atoms ÷ 2), i] = false
end

neighbor_finder = DistanceNeighborFinder(nb_matrix, 10, 1.5u"nm")

Now we can simulate as before.

s = Simulation(
    simulator=VelocityVerlet(),
    atoms=atoms,
    specific_inter_lists=specific_inter_lists,
    general_inters=(LennardJones(nl_only=true),),
    coords=coords,
    velocities=velocities,
    temperature=temp,
    box_size=box_size,
    neighbor_finder=neighbor_finder,
    thermostat=AndersenThermostat(1.0u"ps"),
    loggers=Dict("temp" => TemperatureLogger(10),
                    "coords" => CoordinateLogger(10)),
    timestep=0.002u"ps",
    n_steps=1_000,
)

simulate!(s)

This time when we view the trajectory we can add lines to show the bonds.

visualize(s.loggers["coords"], box_size, "sim_diatomic.gif",
            connections=[(i, i + (n_atoms ÷ 2)) for i in 1:(n_atoms ÷ 2)],
            markersize=0.05, linewidth=5.0)

Diatomic simulation

Simulating gravity

Molly is geared primarily to molecular simulation, but can also be used to simulate other physical systems. Let's set up a gravitational simulation. This example also shows the use of Float32, a 2D simulation and no specified units.

atoms = [Atom(mass=1.0f0), Atom(mass=1.0f0)]
coords = [SVector(0.3f0, 0.5f0), SVector(0.7f0, 0.5f0)]
velocities = [SVector(0.0f0, 1.0f0), SVector(0.0f0, -1.0f0)]
general_inters = (Gravity(nl_only=false, G=1.5f0),)

s = Simulation(
    simulator=VelocityVerlet(),
    atoms=atoms,
    general_inters=general_inters,
    coords=coords,
    velocities=velocities,
    box_size=1.0f0,
    loggers=Dict("coords" => CoordinateLogger(Float32, 10; dims=2)),
    timestep=0.002f0,
    n_steps=2_000,
    force_unit=NoUnits,
    energy_unit=NoUnits,
)

simulate!(s)

When we view the simulation we can use some extra options:

visualize(s.loggers["coords"], 1.0f0, "sim_gravity.gif",
            trails=4, framerate=15, color=[:orange, :lightgreen],
            markersize=0.05)

Gravity simulation

Simulating a protein

Molly has a rudimentary parser of Gromacs topology and coordinate files. Data for a protein can be read into the same data structures as above and simulated in the same way. Currently, the OPLS-AA forcefield is implemented. Here a StructureWriter is used to write the trajectory as a PDB file.

atoms, specific_inter_lists, general_inters, nb_matrix, coords, box_size = readinputs(
            joinpath(dirname(pathof(Molly)), "..", "data", "5XER", "gmx_top_ff.top"),
            joinpath(dirname(pathof(Molly)), "..", "data", "5XER", "gmx_coords.gro"))

temp = 298u"K"

s = Simulation(
    simulator=VelocityVerlet(),
    atoms=atoms,
    specific_inter_lists=specific_inter_lists,
    general_inters=general_inters,
    coords=coords,
    velocities=[velocity(a.mass, temp) for a in atoms],
    temperature=temp,
    box_size=box_size,
    neighbor_finder=DistanceNeighborFinder(nb_matrix, 10, 1.5u"nm"),
    thermostat=AndersenThermostat(1.0u"ps"),
    loggers=Dict("temp" => TemperatureLogger(10),
                    "writer" => StructureWriter(10, "traj_5XER_1ps.pdb")),
    timestep=0.0002u"ps",
    n_steps=5_000,
)

simulate!(s)

Agent-based modelling

Agent-based modelling (ABM) is conceptually similar to molecular dynamics. Julia has Agents.jl for ABM, but Molly can also be used to simulate arbitrary agent-based systems in continuous space. Here we simulate a toy SIR model for disease spread. This example shows how atom properties can be mutable, i.e. change during the simulation, and includes custom forces and loggers (see below for more).

@enum Status susceptible infected recovered

# Custom atom type
mutable struct Person
    i::Int64
    status::Status
    mass::Float64
    σ::Float64
    ϵ::Float64
end

Molly.mass(person::Person) = person.mass

# Custom GeneralInteraction
struct SIRInteraction <: GeneralInteraction
    nl_only::Bool
    dist_infection::Float64
    prob_infection::Float64
    prob_recovery::Float64
end

# Custom force function
function Molly.force(inter::SIRInteraction,
                        coord_i,
                        coord_j,
                        atom_i,
                        atom_j,
                        box_size)
    if (atom_i.status == infected && atom_j.status == susceptible) ||
                (atom_i.status == susceptible && atom_j.status == infected)
        # Infect close people randomly
        dr = vector(coord_i, coord_j, box_size)
        r2 = sum(abs2, dr)
        if r2 < inter.dist_infection ^ 2 && rand() < inter.prob_infection
            atom_i.status = infected
            atom_j.status = infected
        end
    end
    # Workaround to obtain a self-interaction
    if atom_i.i == (atom_j.i + 1)
        # Recover randomly
        if atom_i.status == infected && rand() < inter.prob_recovery
            atom_i.status = recovered
        end
    end
    return zero(coord_i)
end

# Custom Logger
struct SIRLogger <: Logger
    n_steps::Int
    fracs_sir::Vector{Vector{Float64}}
end

# Custom logging function
function Molly.log_property!(logger::SIRLogger, s::Simulation, step_n::Integer)
    if step_n % logger.n_steps == 0
        counts_sir = [
            count(p -> p.status == susceptible, s.atoms),
            count(p -> p.status == infected   , s.atoms),
            count(p -> p.status == recovered  , s.atoms)
        ]
        push!(logger.fracs_sir, counts_sir ./ length(s.atoms))
    end
end

temp = 0.01
timestep = 0.02
box_size = 10.0
n_steps = 1_000
n_people = 500
n_starting = 2
atoms = [Person(i, i <= n_starting ? infected : susceptible, 1.0, 0.1, 0.02) for i in 1:n_people]
coords = placeatoms(n_people, box_size, 0.1; dims=2)
velocities = [velocity(1.0, temp; dims=2) for i in 1:n_people]
general_inters = (LennardJones = LennardJones(nl_only=true), SIR = SIRInteraction(false, 0.5, 0.06, 0.01))

s = Simulation(
    simulator=VelocityVerlet(),
    atoms=atoms,
    general_inters=general_inters,
    coords=coords,
    velocities=velocities,
    temperature=temp,
    box_size=box_size,
    neighbor_finder=DistanceNeighborFinder(trues(n_people, n_people), 10, 2.0),
    thermostat=AndersenThermostat(5.0),
    loggers=Dict("coords" => CoordinateLogger(Float64, 10; dims=2),
                    "SIR" => SIRLogger(10, [])),
    timestep=timestep,
    n_steps=n_steps,
    force_unit=NoUnits,
    energy_unit=NoUnits,
)

simulate!(s)

visualize(s.loggers["coords"], box_size, "sim_agent.gif")

Agent simulation

We can use the logger to plot the fraction of people susceptible (blue), infected (orange) and recovered (green) over the course of the simulation: Fraction SIR

Units

Molly is fairly opinionated about using Unitful.jl units as shown above: you don't have to use them, but it is better if you do. Whilst you occasionally may run into friction with dimension mismatches, using units has the major advantages of catching whole classes of errors and letting you physically interpret the numbers in your system. The performance overhead of using units is minimal.

All your interaction types need to return the same units of force and energy or the simulation will not run. By default these are kJ * mol^-1 * nm^-1 for force and kJ * mol^-1 for energy, but this can be changed using the force_unit and energy_unit arguments to Simulation. If you need to strip units for downstream analysis, use the ustrip function.

Forces

Forces define how different parts of the system interact. The force on each particle in the system is derived from the potential corresponding to the interaction.

\[\vec{F}_i = -\sum_j \frac{dV_{ij}(r_{ij})}{dr_{ij}}\frac{\vec{r}_{ij}}{r_{ij}}\]

In Molly there are two types of interactions. GeneralInteractions are present between all or most atoms, and account for example for non-bonded terms. SpecificInteractions are present between specific atoms, and account for example for bonded terms.

The available general interactions are:

The available specific interactions are:

To define your own GeneralInteraction, first define the struct:

struct MyGeneralInter <: GeneralInteraction
    nl_only::Bool
    # Any other properties, e.g. constants for the interaction or cutoff parameters
end

The nl_only property is required and determines whether the neighbor list is used to omit distant atoms (true) or whether all atom pairs are always considered (false). Next, you need to define the force function acting between a pair of atoms. This has a set series of arguments. For example:

function Molly.force(inter::MyGeneralInter,
                        coord_i,
                        coord_j,
                        atom_i,
                        atom_j,
                        box_size)
    dr = vector(coord_i, coord_j, box_size)

    # Replace this with your force calculation
    # A positive force causes the atoms to move apart
    f = 0.0

    # Obtain a vector for the force
    fdr = f * normalize(dr)
    return fdr
end

If you need to obtain the vector from atom i to atom j, use the vector function. This gets the vector between the closest images of atoms i and j accounting for the periodic boundary conditions. Atom properties can be accessed, e.g. atom_i.σ. Typically the force function is where most computation time is spent during the simulation, so consider optimising this function if you want high performance.

To use your custom force, add it to the list of general interactions:

general_inters = (MyGeneralInter(true),)

Then create and run a Simulation as above. Note that you can also use named tuples instead of tuples if you want to access interactions by name:

general_inters = (MyGeneralInter = MyGeneralInter(true),)

For performance reasons it is best to avoid containers with abstract type parameters, such as Vector{GeneralInteraction}.

To define your own SpecificInteraction, first define the struct:

struct MySpecificInter <: SpecificInteraction
    # Any number of atoms involved in the interaction
    i::Int
    j::Int
    # Any other properties, e.g. a bond distance corresponding to the energy minimum
end

Next, you need to define the force function. For example:

function Molly.force(inter::MySpecificInter, coords, s::Simulation)
    dr = vector(coords[inter.i], coords[inter.j], s.box_size)

    # Replace this with your force calculation
    # A positive force causes the atoms to move apart
    f = 0.0

    fdr = f * normalize(dr)
    return [inter.i, inter.j], [-fdr, fdr]
end

The return values are a list of the atom indices and a list of the vector forces. The example here is between two atoms but can be adapted for any number of atoms. To use your custom force, add it to the specific interaction lists:

specific_inter_lists = ([MySpecificInter(1, 2), MySpecificInter(3, 4)],)

Specific interactions are always run on the CPU (with the results moved to the GPU if required), which is why we can index into coords and access s without harming performance.

Cutoffs

The total potential energy of a system is given as a sum of the individual inter-particle potentials

\[V(\vec{r}_1, \dotsc, \vec{r}_N) = \sum_{i<j}V_{ij}(r_{ij})\]

The forces acting on the particles are given by

\[\vec{F}_i = -\sum_j \frac{dV_{ij}(r_{ij})}{dr_{ij}}\frac{\vec{r}_{ij}}{r_{ij}}\]

In the case of the Lennard-Jones potential, the inter-particle potential is given by

\[V_{ij}(r_{ij}) = 4\varepsilon_{ij} \left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right]\]

and the forces are given by

\[\begin{aligned} \vec{F}_i &= 24\varepsilon_{ij} \left(2\frac{\sigma_{ij}^{12}}{r_{ij}^{13}} - \frac{\sigma_{ij}^6}{r_{ij}^{7}}\right) \frac{\vec{r}_{ij}}{r_{ij}} \\ &= \frac{24\varepsilon_{ij}}{r_{ij}^2} \left[2\left(\frac{\sigma_{ij}^{6}}{r_{ij}^{6}}\right)^2 -\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] \vec{r}_{ij} \end{aligned}\]

As the potential, and thus also the force decreases rapidly with the distance, in almost every implementation of the Lennard-Jones force calculation there is a cutoff radius beyond which the force is set to 0.

While this sounds like a very sensible approach, it introduces a discontinuity in the force function and it requires us to also modify the potential, as beyond the cutoff radius the force would be 0, but the derivative of the unmodified potential is not. One way to truncate the potential is to shift the potential by its cutoff value.

\[\begin{aligned} \vec{F}_{SP}(\vec{r}) &= \begin{cases} \vec{F}(\vec{r}), r < r_c \\ 0, r > r_c \end{cases} \\ V_{SP}(r) &= \begin{cases} V(r) - V(r_c), r \le r_c \\ 0, r > r_c \end{cases} \end{aligned}\]

This way the potential function is continuous and the relation between forces and potentials is satisfied. This truncation method is called shifted potential cutoff.

Another option is to shift the force in order to make it continuous

\[\begin{aligned} F_{SF}(r) &= \begin{cases} F(r) - F(r_c), r \le r_c \\ 0, r > r_c \end{cases} \\ V_{SF}(r) &= \begin{cases} V(r) - (r-r_c) V'(r_c) - V(r_c), r \le r_c \\ 0, r > r_c \end{cases} \end{aligned}\]

This requires a more complicated change in the potential in order to satisfy the relation between them. This method is called the shifted force cutoff. The continuity of the force is desirable as it may give better energy conservation properties as shown in Toxvaerd 2011.

There are also more complicated truncation methods that interpolate between the original potential and 0, but we will consider those two for the moment.

The truncation approximations that we use can significantly alter the qualitative features of the simulation as shown in many articles in the molecular dynamics literature (Fitzner 2017, van der Spoel 2006 and others).

Implementation

Since the truncation algorithm is independent of the interaction for which is used, each interaction is defined without including cutoffs. The corresponding interaction struct has a cutoff field which is then used via dispatch to apply the chosen cutoff.

Simulators

Simulators define what type of simulation is run. This could be anything from a simple energy minimisation to complicated replica exchange MD. The available simulators are:

To define your own Simulator, first define the struct:

struct MySimulator <: Simulator
    # Any properties, e.g. an implicit solvent friction constant
end

Then, define the function that carries out the simulation. This example shows some of the helper functions you can use:

function Molly.simulate!(s::Simulation,
                            simulator::MySimulator,
                            n_steps::Integer;
                            parallel::Bool=true)
    # Find neighbors like this
    neighbors = find_neighbors!(s, nothing, s.neighbor_finder, 0;
                                    parallel=parallel)

    # Show a progress bar like this, if you have imported ProgressMeter
    @showprogress for step_n in 1:n_steps
        # Apply the loggers like this
        for logger in values(s.loggers)
            log_property!(logger, s, step_n)
        end

        # Calculate accelerations like this
        accels_t = accelerations(s, neighbors; parallel=parallel)

        # Ensure coordinates stay within the simulation box like this
        for i in 1:length(s.coords)
            s.coords[i] = adjust_bounds.(s.coords[i], s.box_size)
        end

        # Apply the thermostat like this
        apply_thermostat!(s, s.thermostat)

        # Find new neighbors like this
        neighbors = find_neighbors!(s, neighbors, s.neighbor_finder, step_n;
                                        parallel=parallel)

        # Increment the step counter like this
        s.n_steps_made[1] += 1
    end
    return s
end

To use your custom simulator, give it as the simulator argument when creating the Simulation.

Under the hood there are two implementations of common simulators: an in-place version geared towards CPUs, and an out-of-place version geared towards GPUs and differentiable simulation. You can define different versions of a simulator for in-place and out-of-place simulations by dispatching on Simulation{false} or Simulation{true} respectively. This also applies to thermostats and neighbor lists. You do not have to define two versions though: you may only intend to use the simulation one way, or the out-of-place version may be performant in all cases. The above example is more similar to the in-place version; see the source code for an example of the out-of-place version.

The implementation to use is guessed when you call Simulation based on whether coords is a CuArray, but can be given explicitly with the gpu_diff_safe argument.

Thermostats

Thermostats control the temperature over a simulation. The available thermostats are:

AndersenThermostat.

To define your own Thermostat, first define the struct:

struct MyThermostat <: Thermostat
    # Any properties, e.g. a coupling constant
end

Then, define the function that implements the thermostat every timestep:

function apply_thermostat!(s::Simulation, thermostat::MyThermostat)
    # Do something to the simulation, e.g. scale the velocities
    return s
end

The functions velocity, maxwellboltzmann and temperature may be useful here. To use your custom thermostat, give it as the thermostat argument when creating the Simulation.

Neighbor finders

Neighbor finders find close atoms periodically throughout the simulation, saving on computation time by allowing the force calculation between distant atoms to be omitted. The available neighbor finders are:

To define your own NeighborFinder, first define the struct:

struct MyNeighborFinder <: NeighborFinder
    nb_matrix::BitArray{2}
    n_steps::Int
    # Any other properties, e.g. a distance cutoff
end

Examples of two useful properties are given here: a matrix indicating atom pairs eligible for non-bonded interactions, and a value determining how many timesteps occur between each evaluation of the neighbor finder. Then, define the neighbor finding function that is called every step by the simulator:

function find_neighbors!(s::Simulation,
                            current_neighbors,
                            nf::MyNeighborFinder,
                            step_n::Integer;
                            parallel::Bool=true)
    if step_n % nf.n_steps == 0
        neighbors = Tuple{Int, Int}[]
        # Add to neighbors
        return neighbors
    else
        return current_neighbors
    end
end

To use your custom neighbor finder, give it as the neighbor_finder argument when creating the Simulation.

Loggers

Loggers record properties of the simulation to allow monitoring and analysis. The available loggers are:

To define your own Logger, first define the struct:

struct MyLogger <: Logger
    n_steps::Int
    # Any other properties, e.g. an Array to record values during the trajectory
end

Then, define the logging function that is called every step by the simulator:

function Molly.log_property!(logger::MyLogger, s::Simulation, step_n::Integer)
    if step_n % logger.n_steps == 0
        # Record some property or carry out some action
    end
end

The use of n_steps is optional and is an example of how to record a property every n steps through the simulation. To use your custom logger, add it to the dictionary of loggers:

loggers = Dict("mylogger" => MyLogger(10))

Analysis

Molly contains some tools for analysing the results of simulations. The available analysis functions are:

Julia is a language well-suited to implementing all kinds of analysis for molecular simulations.