Shear Viscosity of a Lennard-Jones Liquid Near the Triple Point (LAMMPS)

This example shows how to calculate the shear viscosity of argon from pressure tensor data obtained with LAMMPS MD simulations. The required theoretical background is explained in the Shear Viscosity section. The same simulations are also used for the bulk viscosity and thermal conductivity examples in the following two notebooks. The goal of the argon examples is to derive all three transport properties with relative errors smaller than those found in the literature.

All argon MD simulations use the Lennard-Jones potential with reduced Lennard-Jones units. For example, the reduced unit of viscosity is denoted as η*, and the reduced unit of time as τ*. The simulated system consists of 1372 argon atoms. The thermodynamic state \(\rho=0.8442\,\rho^*\) and \(T=0.722\,T^*\) corresponds to a liquid phase near the triple point (\(\rho=0.845\,\rho^*\) and \(T=0.69\,T^*\)). This liquid state is known to exhibit slow relaxation, which complicates the convergence of transport properties and makes it a popular test case for computational methods.
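For reference, reduced results can be converted to SI units with the standard Lennard-Jones unit scheme, in which the unit of viscosity is \(\sqrt{m\varepsilon}/\sigma^2\). The sketch below assumes the commonly used argon parameters \(\varepsilon/k_\text{B} = 119.8\) K and \(\sigma = 3.405\) Å, which are not specified in this notebook:

import numpy as np

kboltz = 1.380649e-23             # Boltzmann constant [J/K]
eps = 119.8 * kboltz              # assumed LJ well depth for argon [J]
sigma = 3.405e-10                 # assumed LJ diameter for argon [m]
mass = 39.948 * 1.66053907e-27    # mass of an argon atom [kg]
print(np.sqrt(mass * eps) / sigma**2)  # 1 η* ≈ 9.0e-5 Pa·s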

The LAMMPS input files can be found in the directory docs/data/lammps_lj3d of STACIE's Git source repository. To obtain sufficient data for all three properties, we performed 100 independent runs. The guesstimated relative error for each property is tabulated below, together with the number of independent input time series \(M\); note that the guessed error scales as \(1/\sqrt{M}\). The Lorentz model is used to fit the spectrum, with degrees \(S_\text{num}=\{0, 2\}\) and \(S_\text{den}=\{2\}\), corresponding to \(P=3\) parameters.

Property               \(M\)   Guess rel. error
--------------------   -----   ----------------
Shear viscosity          500        0.5 %
Bulk viscosity           100        1.2 %
Thermal conductivity     300        0.7 %
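Concretely, these degrees mean that the spectrum is fitted with a rational (Padé) function of the frequency \(f\). In the notation below, the parameter names \(a_0\), \(a_2\), and \(b_2\) are chosen for illustration and are not taken from STACIE's API:

\[
\hat{I}(f) = \frac{a_0 + a_2 f^2}{1 + b_2 f^2}
\]

i.e. a Lorentzian peak at zero frequency plus a constant background, with \(P=3\) parameters.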

The (initial) settings for the production runs were determined as follows. In general, the integration time step in MD simulations roughly corresponds to one tenth of the period of the fastest oscillations in the system, so dynamics on time scales shorter than about 10 steps are most likely irrelevant for transport properties. Hence, in our first simulations, all data was recorded with block averages over 10 steps. As mentioned in the section on block averages, at least \(400 P\) blocks are recommended. With \(P=3\), the initial production runs therefore consisted of 1200 blocks, i.e. 12000 MD steps. Note that these values are only coarse estimates. As explained below, the production runs were extended twice to improve the statistics.
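The same arithmetic in code form (a sketch; the MD time step of 0.003 τ* is inferred from the block time of 0.030 τ* reported by the analysis output below):

num_parameters = 3      # P: parameters of the Lorentz model
block_size = 10         # MD steps per block average
timestep = 0.003        # MD time step in τ* (inferred, see lead-in)
num_blocks = 400 * num_parameters
num_steps = num_blocks * block_size
print(num_steps, num_steps * timestep)  # 12000 MD steps, 36.0 τ*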

Details of the MD simulations can be found in the LAMMPS inputs docs/data/lammps_lj3d/template-init.lammps and docs/data/lammps_lj3d/template-ext.lammps in STACIE's Git repository. These input files are actually Jinja2 templates that are rendered with different random seeds (and restart files) for each run. The initial production simulations start from an FCC crystal structure, which is first melted for 5000 steps at an elevated temperature of \(T=1.5\,T^*\) in the NVT ensemble. The system is then equilibrated at the desired temperature of \(T=0.722\,T^*\) for 5000 additional steps. Starting from the equilibrated states, production runs were performed in the NVE ensemble. The velocities are not rescaled after the NVT equilibration, so that the set of NVE runs as a whole remains representative of the NVT ensemble. During the production phase, trajectory data is collected with block averages over 10 steps.

The LAMMPS input files contain commands to write output files that can be directly loaded using Python and NumPy without any additional converters or wrappers. The following output files from docs/data/lammps_lj3d/sims/replica_????_part_??/ were used for the analysis:

  • info.yaml: simulation settings that may be useful for post-processing.

  • nve_thermo.txt: subsampled instantaneous temperature and related quantities.

  • nve_pressure_blav.txt: block-averaged (off)diagonal pressure tensor components.

  • nve_heatflux_blav.txt: block-averaged \(x\), \(y\), and \(z\) components of the heat flux vector, i.e. \(J^\text{h}_x\), \(J^\text{h}_y\), and \(J^\text{h}_z\). Heat fluxes are used in the thermal conductivity example, not in this notebook.
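All of these are plain text files. For instance, a single replica's settings and pressure data can be loaded as follows (a minimal sketch; the exact column layouts are defined in the LAMMPS input templates):

import numpy as np
from yaml import safe_load

prefix = "docs/data/lammps_lj3d/sims/replica_0000_part_00/"
with open(prefix + "info.yaml") as fh:
    info = safe_load(fh)                      # e.g. info["natom"], info["timestep"]
pressure = np.loadtxt(prefix + "nve_pressure_blav.txt")
print(info["natom"], pressure.shape)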

Note

The results in this example were obtained using LAMMPS 29 Aug 2024 Update 3. Minor differences may arise when using a different version of LAMMPS, or even the same version compiled with a different compiler.

Library Imports and Configuration

import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from path import Path
from yaml import safe_load
from scipy.stats import chi2
from stacie import (
    UnitConfig,
    compute_spectrum,
    estimate_acint,
    LorentzModel,
    plot_fitted_spectrum,
    plot_extras,
)
from utils import plot_instantaneous_percentiles, plot_cumulative_temperature_histogram
mpl.rc_file("matplotlibrc")
%config InlineBackend.figure_formats = ["svg"]
# You normally do not need to change this path.
# It only needs to be overridden when building the documentation.
DATA_ROOT = Path(os.getenv("DATA_ROOT", "./")) / "lammps_lj3d/sims/"

Analysis of the Equilibration Runs

To ensure that the production runs start from a well-equilibrated state, we first analyze the equilibration runs. The following code cell plots percentiles of the instantaneous temperature as a function of time over all independent runs. For reference, the theoretical percentiles of the NVT ensemble are shown as horizontal dotted lines.
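The expected percentiles follow from the fact that, in the NVT ensemble with \(k_\text{B}=1\), the instantaneous temperature is a scaled chi-squared variable: \(T_\text{inst} = T_\text{d}\,X/N_\text{dof}\) with \(X \sim \chi^2(N_\text{dof})\) and \(N_\text{dof}=3N-3\). A minimal illustration of the formula used in the code below:

from scipy.stats import chi2

ndof = 3 * 1372 - 3
# 95th percentile of the instantaneous temperature at T* = 0.722:
print(chi2.ppf(0.95, ndof) * 0.722 / ndof)  # slightly above 0.722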

def plot_equilibration(ntraj: int = 100):
    """Plot percentiles of the instantaneous temperature."""
    # Load the configuration from the YAML file.
    with open(DATA_ROOT / "replica_0000_part_00/info.yaml") as fh:
        info = safe_load(fh)
    temp_d = info["temperature"]
    ndof = info["natom"] * 3 - 3

    # Load trajectory data.
    temps = []
    time = None
    for itraj in range(ntraj):
        equil_dir = DATA_ROOT / f"replica_{itraj:04d}_part_00/"
        data = np.loadtxt(equil_dir / "nvt_thermo.txt")
        temps.append(data[:, 1])
        if time is None:
            time = info["block_size"] * info["timestep"] * np.arange(len(data))
    temps = np.array(temps)

    # Select the last part (final temperature), discarding the melting phase.
    temps = temps[:, 550:]
    time = time[550:]

    # Plot the instantaneous and desired temperature.
    plt.close("tempequil")
    _, ax = plt.subplots(num="tempequil")
    percents = [95, 80, 50, 20, 5]
    plot_instantaneous_percentiles(
        ax,
        time,
        temps,
        percents,
        expected=[chi2.ppf(percent / 100, ndof) * temp_d / ndof for percent in percents],
    )
    ax.set_title(
        "Percentiles of the instantaneous temperature over the 100 equilibration runs"
    )
    ax.set_ylabel("Temperature")
    ax.set_xlabel("Time")


plot_equilibration()
[Figure: percentiles of the instantaneous temperature over the 100 equilibration runs]

The plot shows that the equilibration runs were successful: they reach the correct average temperature and exhibit the expected fluctuations. Note that we used a Langevin thermostat for equilibration. This is a robust local thermostat that quickly brings all degrees of freedom to the desired temperature. In comparison, a global Nosé-Hoover-chain (NHC) thermostat would still show large oscillations in the temperature, even after 5000 steps. Taking the last state from an NHC run generally results in biased initial conditions for the NVE runs. (You can see the difference by modifying the LAMMPS input files, rerunning them, and then rerunning this notebook.)

Analysis of the Initial Production Simulations

The following code cell defines analysis functions:

  • get_indep_paniso transforms the pressure tensor components into five independent anisotropic contributions, as explained in the Shear Viscosity theory section.

  • estimate_viscosity calculates the viscosity and plots the results. It also prints recommendations for data reduction (block averaging) and simulation time, as explained in the documentation sections on block averages and on recommended simulation settings. These recommendations will be used to determine whether our initial simulation settings are appropriate.

def get_indep_paniso(pcomps):
    """Transform the six pressure tensor components (xx, yy, zz, xy, xz, yz)
    into five independent anisotropic contributions."""
    return np.array(
        [
            (pcomps[0] - 0.5 * pcomps[1] - 0.5 * pcomps[2]) / np.sqrt(3),
            0.5 * pcomps[1] - 0.5 * pcomps[2],
            pcomps[3],
            pcomps[4],
            pcomps[5],
        ]
    )


def estimate_viscosity(name, pcomps, av_temperature, volume, timestep, verbose=True):
    # Create the spectrum of the pressure fluctuations.
    # Note that the Boltzmann constant is 1 in reduced LJ units.
    uc = UnitConfig(
        acint_fmt=".3f",
        acint_symbol="η",
        acint_unit_str="η*",
        freq_unit_str="1/τ*",
        time_fmt=".3f",
        time_unit_str="τ*",
    )
    spectrum = compute_spectrum(
        pcomps,
        prefactors=volume / av_temperature,  # Green-Kubo prefactor V / (kB T)
        timestep=timestep,
    )

    # Estimate the viscosity from the spectrum.
    result = estimate_acint(spectrum, LorentzModel(), verbose=verbose, uc=uc)

    if verbose:
        # Plot some basic analysis figures.
        plt.close(f"{name}_spectrum")
        _, ax = plt.subplots(num=f"{name}_spectrum")
        plot_fitted_spectrum(ax, uc, result)
        plt.close(f"{name}_extras")
        _, axs = plt.subplots(2, 2, num=f"{name}_extras")
        plot_extras(axs, uc, result)

    # Return the viscosity
    return result
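As a quick sanity check of get_indep_paniso, one can feed it synthetic white noise (this check is not part of the original analysis). For a rotation-invariant Gaussian random tensor, the diagonal components have twice the variance of the off-diagonal ones; the five derived contributions should then come out uncorrelated, with equal variances:

rng = np.random.default_rng(42)
nstep = 100_000
pcomps_synth = np.concatenate([
    rng.normal(scale=np.sqrt(2.0), size=(3, nstep)),  # xx, yy, zz fluctuations
    rng.normal(scale=1.0, size=(3, nstep)),           # xy, xz, yz fluctuations
])
# The covariance matrix should be close to the 5x5 identity matrix.
print(np.cov(get_indep_paniso(pcomps_synth)).round(2))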

The next cell performs the analysis of the initial simulations. It prints the recommended block size and the simulation time for the production runs, and then generates two figures:

  • The spectrum of the off-diagonal pressure fluctuations, and the model fitted to the spectrum.

  • Additional intermediate results.

def analyze_production(npart: int, ntraj: int = 100, select: int | None = None):
    """
    Perform the analysis of the production runs.

    Parameters
    ----------
    npart
        Number of parts in the production runs.
        For the initial production runs, this is 1.
    ntraj
        Number of trajectories in the production runs.
    select
        If `None`, all anisotropic contributions are selected.
        If not `None`, only select the given anisotropic contribution
        for the viscosity estimate. Must be one of `0`, `1`, `2`, `3`, `4`, `None`.

    Returns
    -------
    result
        The result from STACIE's `estimate_acint` function.
    """
    # Load the configuration from the YAML file.
    with open(DATA_ROOT / "replica_0000_part_00/info.yaml") as fh:
        info = safe_load(fh)

    # Load trajectory data.
    thermos = []
    pcomps_full = []
    for itraj in range(ntraj):
        thermos.append([])
        pcomps_full.append([])
        for ipart in range(npart):
            prod_dir = DATA_ROOT / f"replica_{itraj:04d}_part_{ipart:02d}/"
            thermos[-1].append(np.loadtxt(prod_dir / "nve_thermo.txt"))
            pcomps_full[-1].append(np.loadtxt(prod_dir / "nve_pressure_blav.txt"))
    thermos = [np.concatenate(parts).T for parts in thermos]
    pcomps_full = [np.concatenate(parts).T for parts in pcomps_full]
    av_temperature = np.mean([thermo[1] for thermo in thermos])

    # Compute the viscosity.
    # Each trajectory contributes 5 rows (one per anisotropic contribution),
    # so contribution `select` of every trajectory is found with stride 5.
    pcomps_aniso = np.concatenate([get_indep_paniso(p[1:]) for p in pcomps_full])
    if select is not None:
        if select < 0 or select > 4:
            raise ValueError(f"Invalid selection {select}, must be in [0, 4]")
        pcomps_aniso = pcomps_aniso[select::5]
    return estimate_viscosity(
        f"part{npart}",
        pcomps_aniso,
        av_temperature,
        info["volume"],
        info["timestep"] * info["block_size"],
        verbose=select is None,
    )


eta_production_init = analyze_production(1).acint
CUTOFF FREQUENCY SCAN cv2l(125%)
     neff   criterion  fcut [1/τ*]
---------  ----------  ----------
     15.0         inf    3.93e-01  (Variance of the correlation time estimate is too large.)
     15.9         inf    4.18e-01  (Variance of the correlation time estimate is too large.)
     16.9         inf    4.45e-01  (Variance of the correlation time estimate is too large.)
     18.0         inf    4.73e-01  (Variance of the correlation time estimate is too large.)
     19.1         inf    5.04e-01  (Variance of the correlation time estimate is too large.)
     20.3         inf    5.36e-01  (Variance of the correlation time estimate is too large.)
     21.6         inf    5.71e-01  (Variance of the correlation time estimate is too large.)
     23.0         inf    6.08e-01  (Variance of the correlation time estimate is too large.)
     24.4         inf    6.47e-01  (Variance of the correlation time estimate is too large.)
     25.9         inf    6.89e-01  (Variance of the correlation time estimate is too large.)
     27.6         inf    7.33e-01  (Variance of the correlation time estimate is too large.)
     29.3         8.0    7.81e-01
     31.2        11.4    8.31e-01
     33.2        16.1    8.85e-01
     35.3        22.0    9.42e-01
     37.5        29.1    1.00e+00
     39.9        37.1    1.07e+00
     42.4        45.7    1.14e+00
     45.1        54.2    1.21e+00
     48.0        63.4    1.29e+00
     51.1        73.3    1.37e+00
     54.4        85.0    1.46e+00
     57.8        99.6    1.55e+00
     61.5       118.2    1.65e+00
Cutoff criterion exceeds incumbent + margin: 8.0 + 100.0.

INPUT TIME SERIES
    Time step:                     0.030 τ*
    Simulation time:               36.000 τ*
    Maximum degrees of freedom:    1000.0

MAIN RESULTS
    Autocorrelation integral:      3.205 ± 0.063 η*
    Integrated correlation time:   0.138 ± 0.003 τ*

SANITY CHECKS (weighted averages over cutoff grid)
    Effective number of points:    29.4 (ideally > 60)
    Regression cost Z-score:       -0.6 (ideally < 2)
    Cutoff criterion Z-score:      6.3 (ideally < 2)

MODEL lorentz(0.1) | CUTOFF CRITERION cv2l(125%)
    Number of parameters:          3
    Average cutoff frequency:      7.82e-01 1/τ*
    Exponential correlation time:  0.377 ± 0.036 τ*

RECOMMENDED SIMULATION SETTINGS (EXPONENTIAL CORR. TIME)
    Block time:                    0.118 ± 0.020 τ*
    Simulation time:               23.689 ± 0.285 τ*
[Figure: sampled spectrum of the anisotropic pressure fluctuations with the fitted Lorentz model]
[Figure: additional intermediate results of the autocorrelation-integral estimate]

Several things can be observed in the analysis of the initial production runs:

  • The recommendations based on the exponential correlation time were met by the initial simulation settings.

    • The recommended simulation time is 24 τ*, which is about 8000 MD steps. The initial production runs (12000 steps) were therefore long enough.

    • The recommended block time is 0.118 τ*, which corresponds to about 40 steps. The block size used in the initial production runs (10 steps) was sufficiently small.

  • The relative error of the viscosity estimate is about 2%, which is larger than the guesstimated value of 0.5%. This is fine and somewhat expected, since the guess is known to be crude.

  • The Lorentz model used to fit the spectrum was a fair choice, but for higher frequencies, the sampled PSD clearly decays faster than the fitted model. For the case of viscosity, there is (to the best of our knowledge) no solid theoretical argument to support the exponential decay of the ACF of the pressure tensor. It just seems to be a reasonable choice for this case.

  • The effective number of points fitted to the spectrum is 29.4, which is low for a three-parameter model. For high-quality production simulations, it would be good to triple the simulation length, so as to multiply the resolution of the frequency grid by 3. This is hopefully sufficient to reach 60 effective points.

As can be seen in the comparison to literature results below, the results for the initial production runs were already quite good. However, for the sake of demonstration, the production runs were extended by an additional 24000 steps each, tripling the simulation time. This increased the effective number of points fitted to the spectrum to 61, a sublinear increase that was just enough to reach the target of 60. For the sake of demonstration, we extended the production runs by another 64000 steps, resulting in a total simulation time of 300 τ* per run.

The difficulty of increasing the effective number of fitted points can be understood as follows. The Lorentz model cannot fit the spectrum up to higher frequencies. As more data points are included, the limitations of the approximating model become clearer, and the cutoff criterion detects underfitting (and thus a risk of bias) at lower cutoffs.
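This is consistent with the high-frequency behavior of the rational model introduced above (in the illustrative notation used there):

\[
\lim_{f \to \infty} \frac{a_0 + a_2 f^2}{1 + b_2 f^2} = \frac{a_2}{b_2} > 0,
\]

whereas the sampled PSD keeps decaying with frequency, so fits over wide cutoff ranges become systematically biased.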

Analysis of the Production Simulations

Here we just repeat the analysis, but now with extended production runs.

eta_production_ext = analyze_production(3).acint
CUTOFF FREQUENCY SCAN cv2l(125%)
     neff   criterion  fcut [1/τ*]
---------  ----------  ----------
     15.0         inf    4.71e-02  (Variance of the correlation time estimate is too large.)
     15.9         inf    5.01e-02  (Variance of the correlation time estimate is too large.)
     16.9         inf    5.34e-02  (Variance of the correlation time estimate is too large.)
     18.0         inf    5.68e-02  (Variance of the correlation time estimate is too large.)
     19.1         inf    6.05e-02  (Variance of the correlation time estimate is too large.)
     20.3         inf    6.44e-02  (Variance of the correlation time estimate is too large.)
     21.6         inf    6.85e-02  (Variance of the correlation time estimate is too large.)
     23.0         inf    7.30e-02  (Variance of the correlation time estimate is too large.)
     24.4         inf    7.77e-02  (Variance of the correlation time estimate is too large.)
     25.9         inf    8.27e-02  (Variance of the correlation time estimate is too large.)
     27.6         inf    8.80e-02  (Variance of the correlation time estimate is too large.)
     29.3         inf    9.37e-02  (Variance of the correlation time estimate is too large.)
     31.2         inf    9.97e-02  (Variance of the correlation time estimate is too large.)
     33.2         inf    1.06e-01  (Variance of the correlation time estimate is too large.)
     35.3         inf    1.13e-01  (Variance of the correlation time estimate is too large.)
     37.5         inf    1.20e-01  (Variance of the correlation time estimate is too large.)
     39.9         inf    1.28e-01  (Variance of the correlation time estimate is too large.)
     42.4         inf    1.36e-01  (Variance of the correlation time estimate is too large.)
     45.1         inf    1.45e-01  (Variance of the correlation time estimate is too large.)
     48.0         inf    1.54e-01  (Variance of the correlation time estimate is too large.)
     51.1         inf    1.64e-01  (Variance of the correlation time estimate is too large.)
     54.4         inf    1.75e-01  (Variance of the correlation time estimate is too large.)
     57.8         inf    1.86e-01  (Variance of the correlation time estimate is too large.)
     61.5         inf    1.98e-01  (Variance of the correlation time estimate is too large.)
     65.5         inf    2.11e-01  (Variance of the correlation time estimate is too large.)
     69.7         inf    2.25e-01  (Variance of the correlation time estimate is too large.)
     74.1         inf    2.39e-01  (Variance of the correlation time estimate is too large.)
     78.9         inf    2.55e-01  (Variance of the correlation time estimate is too large.)
     83.9         inf    2.71e-01  (Variance of the correlation time estimate is too large.)
     89.3         inf    2.89e-01  (Variance of the correlation time estimate is too large.)
     95.0         inf    3.07e-01  (Variance of the correlation time estimate is too large.)
    101.1         inf    3.27e-01  (Variance of the correlation time estimate is too large.)
    107.6         inf    3.48e-01  (Variance of the correlation time estimate is too large.)
    114.5         2.6    3.70e-01
    121.9         2.5    3.94e-01
    129.7         2.4    4.20e-01
    138.0         2.3    4.47e-01
    146.9         2.4    4.76e-01
    156.3         3.0    5.06e-01
    166.4         4.1    5.39e-01
    177.1         6.3    5.74e-01
    188.5        10.2    6.11e-01
    200.6        16.3    6.50e-01
    213.5        25.2    6.92e-01
    227.2        37.2    7.37e-01
    241.9        52.6    7.84e-01
    257.4        71.1    8.35e-01
    274.0        93.2    8.89e-01
    291.6       119.6    9.46e-01
Cutoff criterion exceeds incumbent + margin: 2.3 + 100.0.

INPUT TIME SERIES
    Time step:                     0.030 τ*
    Simulation time:               300.000 τ*
    Maximum degrees of freedom:    1000.0

MAIN RESULTS
    Autocorrelation integral:      3.245 ± 0.026 η*
    Integrated correlation time:   0.139 ± 0.001 τ*

SANITY CHECKS (weighted averages over cutoff grid)
    Effective number of points:    134.8 (ideally > 60)
    Regression cost Z-score:       -0.0 (ideally < 2)
    Cutoff criterion Z-score:      2.9 (ideally < 2)

MODEL lorentz(0.1) | CUTOFF CRITERION cv2l(125%)
    Number of parameters:          3
    Average cutoff frequency:      4.36e-01 1/τ*
    Exponential correlation time:  0.414 ± 0.036 τ*

RECOMMENDED SIMULATION SETTINGS (EXPONENTIAL CORR. TIME)
    Block time:                    0.130 ± 0.020 τ*
    Simulation time:               26.003 ± 0.288 τ*
[Figure: fitted spectrum for the extended production runs]
[Figure: additional intermediate results for the extended production runs]

Some remarks about the final results:

  • The effective number of points has increased to 134.8, which is a fine number of data points for a model with \(P=3\) parameters.

  • For higher frequency cutoffs, both Z-scores increase, showing that the autocorrelation function only decays exponentially in the limit of large lag times. This is expected, since at sufficiently short time scales, the pressure tensor fluctuations are smooth functions, i.e. not featuring the cusp of a purely exponential ACF.

Comparison to Literature Results

Comprehensive literature surveys on computational estimates of the shear viscosity of a Lennard-Jones fluid can be found in [MLK04a] and [VSG07a]. These papers also present new results, which are included in the table below. Since the simulation settings (\(r_\text{cut}^{*}=2.5\), \(N=1372\), \(T^*=0.722\) and \(\rho^{*}=0.8442\)) are identical to those used in this notebook, the reported values should be directly comparable.

Method                       Simulation time [τ*]   Shear viscosity [η*]   Reference
--------------------------   --------------------   --------------------   ------------------
EMD NVE (STACIE)                             3600          3.205 ± 0.063   (here) initial
EMD NVE (STACIE)                            10800          3.207 ± 0.040   (here) extension 1
EMD NVE (STACIE)                            30000          3.245 ± 0.026   (here) extension 2
EMD NVE (Helfand-Einstein)                  75000          3.277 ± 0.098   [MLK04a]
EMD NVE (Helfand-moment)                   600000          3.268 ± 0.055   [VSG07a]

This comparison confirms that STACIE reproduces a well-known viscosity result, in line with the literature, while requiring far less simulation time to reach state-of-the-art statistical uncertainty. Even our longest production runs are less than half as long as those in the cited papers, yet they yield a much smaller uncertainty.

To be fair, the simulation times above only account for the production runs. Our setup also includes a significant amount of equilibration (10000 steps per replica, i.e. 3000 τ* in total over the 100 runs) to ensure that the production runs are mutually uncorrelated. Even when these additional runs are included, the overall simulation time remains significantly lower than in the cited papers.

Validation of the Production Runs

To further establish that our NVE runs together represent the NVT ensemble, the following two cells perform additional validation checks.

  • A plot of the conserved quantity of the separate NVE runs, to detect any drift.

  • The distribution of the instantaneous temperature, which should match the desired NVT distribution. For each individual NVE run and for the combined NVE runs, cumulative distributions are plotted. The function also plots the expected cumulative distribution of the NVT ensemble.

def plot_total_energy():
    # Load trajectory data.
    time = None
    energies = []
    for itraj in range(100):
        time_traj = []
        energies_traj = []
        for ipart in range(3):
            prod_dir = DATA_ROOT / f"replica_{itraj:04d}_part_{ipart:02d}/"
            data = np.loadtxt(prod_dir / "nve_thermo.txt")
            if time is None:
                time_traj.append(data[:, 0])
            energies_traj.append(data[:, 2:])
        if time is None:
            time = np.concatenate(time_traj)
        energies.append(energies_traj)
    energies = [np.concatenate(energy).T for energy in energies]

    # Plot the total energy of the NVE runs.
    plt.close("energyprod")
    _, ax = plt.subplots(num="energyprod")
    for kes, pes in energies:
        ax.plot(time, kes + pes)
    ax.set_xlabel("Time")
    ax.set_ylabel("Total Energy")
    ax.set_title("Total Energy of the NVE Runs")


plot_total_energy()
[Figure: total energy of the NVE runs]

There is no noticeable drift in the total energy of the NVE runs. Apart from the usual (and acceptable) numerical noise, the total energy is conserved perfectly.
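To go beyond the visual check, the drift could be quantified with a least-squares slope per run. This is a minimal sketch (not part of the original notebook), assuming the time and energies arrays as constructed inside plot_total_energy:

def max_energy_drift(time, energies):
    """Return the largest absolute linear drift rate of the total energy (per τ*)."""
    slopes = [np.polyfit(time, kes + pes, 1)[0] for kes, pes in energies]
    return max(abs(slope) for slope in slopes)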

def validate_temperature():
    """Plot cumulative distributions of the instantaneous temperature."""
    # Load the configuration from the YAML file.
    with open(DATA_ROOT / "replica_0000_part_00/info.yaml") as fh:
        info = safe_load(fh)
    temp_d = info["temperature"]
    ndof = info["natom"] * 3 - 3

    # Load trajectory data.
    temps = []
    for itraj in range(100):
        temps.append([])
        for ipart in range(3):
            prod_dir = DATA_ROOT / f"replica_{itraj:04d}_part_{ipart:02d}/"
            temps[-1].append(np.loadtxt(prod_dir / "nve_thermo.txt")[:, 1])
    temps = [np.concatenate(temp).T for temp in temps]

    # Plot the instantaneous and desired temperature distribution.
    plt.close("tempprod")
    _, ax = plt.subplots(num="tempprod")
    plot_cumulative_temperature_histogram(ax, temps, temp_d, ndof, "τ*")


validate_temperature()
[Figure: cumulative distributions of the instantaneous temperature of the NVE runs, compared to the NVT expectation]

This plot offers detailed insight into NVE versus NVT temperature distributions:

  • In the NVE ensemble, the temperature distribution is relatively narrow. Hence, using a single NVE run would not be representative of the temperature variance of the NVT ensemble.

  • Some of the individual NVE runs have significantly lower or higher temperatures than the average. If the transport property of interest depends nonlinearly on the temperature, this effect leads to a shift in the estimated transport property compared to using a single NVE run.

In the limit of macroscopic system sizes (\(N \rightarrow \infty\)), the NVE ensemble converges to the NVT ensemble. However, in simulations, one operates at finite system sizes, well below the thermodynamic limit.

Validation of the Independence of the Anisotropic Contributions

Here we validate numerically that the five independent anisotropic contributions to the pressure tensor are indeed statistically independent. The covariance matrix of the anisotropic contributions is computed and visualized; its off-diagonal elements should vanish within statistical noise.

def validate_independence(ntraj: int = 100):
    """Validate the independence of the anisotropic contributions."""
    # Load trajectory data.
    pcomps_full = []
    for itraj in range(ntraj):
        pcomps_full.append([])
        for ipart in range(3):
            prod_dir = DATA_ROOT / f"replica_{itraj:04d}_part_{ipart:02d}/"
            pcomps_full[-1].append(np.loadtxt(prod_dir / "nve_pressure_blav.txt")[:, 1:])
    pcomps_aniso = [get_indep_paniso(np.concatenate(parts).T) for parts in pcomps_full]

    # Compute the average of the covariance matrix over all NVE trajectories.
    cov = np.mean([np.cov(p, ddof=0) for p in pcomps_aniso], axis=0)
    scale = abs(cov).max() * 1.05

    # Plot the covariance matrix.
    plt.close("covariance")
    _, ax = plt.subplots(num="covariance")
    im = ax.imshow(
        cov, cmap="coolwarm", vmin=-scale, vmax=scale, extent=[0.5, 5.5, 0.5, 5.5]
    )
    ax.set_title("Covariance of Anisotropic Pressure Contributions")
    ax.set_xlabel("Anisotropic Contribution $P'_i$")
    ax.set_ylabel("Anisotropic Contribution $P'_j$")
    plt.colorbar(im, ax=ax)


validate_independence()
[Figure: covariance matrix of the anisotropic pressure contributions]

The plot confirms that there is (at least visually) no sign of any statistical correlation between the anisotropic contributions. Note that one may perform more rigorous statistical tests to validate the independence of the five contributions. Here, we keep it simple for the sake of an intuitive demonstration.
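One possible starting point for such a test, sketched here and not part of the original notebook, is a pairwise Pearson correlation test on the five contributions of one trajectory (e.g. one element of pcomps_aniso from the function above). Note that the plain Pearson p-value ignores the serial correlation of the time series, so it is only indicative:

from scipy.stats import pearsonr

def test_pairwise_correlations(pcomps_aniso):
    """Print the Pearson correlation and p-value for every pair of contributions."""
    for i in range(5):
        for j in range(i + 1, 5):
            r, p = pearsonr(pcomps_aniso[i], pcomps_aniso[j])
            print(f"P'_{i + 1} vs P'_{j + 1}: r = {r:.4f}, p = {p:.3f}")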

Validation of the Consistency of the Anisotropic Contributions

The following code cell shows that the five independent anisotropic contributions result in the same shear viscosity estimate, within the predicted uncertainties.

def validate_consistency():
    for i in range(5):
        result = analyze_production(3, select=i)
        eta = result.acint
        eta_std = result.acint_std
        print(f"Anisotropic contribution {i + 1}: η = {eta:.3f} ± {eta_std:.3f} η*")


validate_consistency()
Anisotropic contribution 1: η = 3.247 ± 0.053 η*
Anisotropic contribution 2: η = 3.193 ± 0.047 η*
Anisotropic contribution 3: η = 3.248 ± 0.052 η*
Anisotropic contribution 4: η = 3.183 ± 0.048 η*
Anisotropic contribution 5: η = 3.246 ± 0.052 η*

Note that one may perform more rigorous statistical tests to validate the consistency of the results. Here, we keep it simple for the sake of an intuitive demonstration.
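One simple quantitative check, sketched here under the assumption that the five estimates are uncorrelated (not exactly true, since they come from the same trajectories), is a chi-squared test for a common mean with inverse-variance weights:

import numpy as np
from scipy.stats import chi2

def consistency_pvalue(values, stds):
    """Chi-squared p-value for the hypothesis that all estimates share one mean."""
    values = np.asarray(values)
    weights = 1.0 / np.asarray(stds) ** 2
    mean = np.sum(weights * values) / np.sum(weights)
    statistic = np.sum(weights * (values - mean) ** 2)
    return chi2.sf(statistic, df=len(values) - 1)

# Using the five estimates printed above; a large p-value supports consistency.
print(consistency_pvalue(
    [3.247, 3.193, 3.248, 3.183, 3.246],
    [0.053, 0.047, 0.052, 0.048, 0.052],
))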

Regression Tests

If you are experimenting with this notebook, you can ignore any exceptions below. The tests are only meant to pass for the notebook in its original form.

if abs(eta_production_init - 3.236) > 0.1:
    raise ValueError(f"wrong viscosity (production): {eta_production_init:.3e}")
if abs(eta_production_ext - 3.257) > 0.1:
    raise ValueError(f"wrong viscosity (production): {eta_production_ext:.3e}")