..
: YAFF is yet another force-field code.
: Copyright (C) 2011 Toon Verstraelen ,
: Louis Vanduyfhuys , Center for Molecular Modeling
: (CMM), Ghent University, Ghent, Belgium; all rights reserved unless otherwise
: stated.
:
: This file is part of YAFF.
:
: YAFF is free software; you can redistribute it and/or
: modify it under the terms of the GNU General Public License
: as published by the Free Software Foundation; either version 3
: of the License, or (at your option) any later version.
:
: YAFF is distributed in the hope that it will be useful,
: but WITHOUT ANY WARRANTY; without even the implied warranty of
: MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
: GNU General Public License for more details.
:
: You should have received a copy of the GNU General Public License
: along with this program; if not, see
:
: --
Trajectory Analysis
###################
The analysis of the results is (in the first place) based on the HDF5 trajectory
output files. On-line analysis (during the iterative algorithm, without
writing data to disk) is also possible. As a final option, trajectory data
from other programs may also be converted to the HDF5 format used by Yaff, which
can then be post-processed as if it was generated by Yaff.
In all the examples below, the following import statement is implicitly
assumed::
import h5py as h5
Slicing the data
================
All the analysis routines below have at least the following four optional
arguments:
* ``start``: the first sample to consider for the analysis
* ``step``: consider only a sample each ``step`` iterations.
* ``max_sample``: consider at most ``max_sample`` number of samples.
Some routines also support the following optional argument:
* ``end``: the last sample to consider for the analysis.
The ``max_sample`` option is only possible when ``step`` is not specified and
the total number of samples (or ``end``) is known. When ``max_sample`` is given,
the corresponding optimal value for ``step`` will be derived from
``max_sample``. Some analysis may not have the max_sample argument, e.g. the
spectrum analysis, because the choice of the step size for such analysis is a
critical parameter that needs to be set carefully.
See :func:`yaff.analysis.utils.get_slice` for the documentation of the generic
underlying routine that handles the slices.
Basic analysis
==============
A few basic analysis routines are provided to quickly check the sanity of an MD
simulation:
* ``plot_energies`` makes a plot of the kinetic and the total energy as function
of time. For example::
f = h5.File('output.h5')
plot_energies(f)
f.close()
makes a figure ``energies.png``.
* ``plot_temperate`` and ``plot_pressure`` are similar, but they plot the
temperature and pressure as function of time instead of the energies.
* ``plot_temp_dist`` plots the distribution (both pdf and cdf) of the
instantaneous atomic and system temperatures and compares these with the
expected analytical result for a constant-temperature ensemble. For example::
f = h5.File('output.h5')
plot_temp_dist(f)
f.close()
makes a figure ``temp_dist.png``
All these functions accept optional arguments to tune their behavior. See
:mod:`yaff.analysis.basic` for more analysis routines and more details.
Advanced analysis
=================
Yaff also includes analysis tools that can extract relevant macroscopic
properties from a simulation. These analysis tools require some additional
computations that can either be done in a post-processing step, or on-line.
Radial distribution functions (RDFs)
------------------------------------
The following example computes and oxygen-oxygen RDF using a previously
generated trajectory file::
f = h5.File('output.h5')
select = system.get_indexes('8') # select atoms with ATSELECT rule
rdf = RDF(f, rcut=4.8*angstrom, rspacing=0.1*angstrom, max_sample=100, select0=select)
rdf.plot()
rdf.plot_crdf()
f.close()
In this example, the cutoff for the RDF is 4.8 Å and the spacing of the
bins 0.1 Å. At most 100 samples are used to compute the RDF. The results are
included in the HDF5 file, and optionally plotted using matplotlib.
Alternatively, the same ``RDFAnalysis`` class can be used for on-line
analysis, without the need to store huge amounts of data on disk::
select = system.get_indexes('O')
rdf = RDF(None, 4.8*angstrom, 0.1*angstrom, max_sample=100, select0=select)
verlet = VerletIntegrator(ff, hooks=rdf, temp0=300)
verlet.run(5000)
rdf.plot()
rdf.plot_crdf()
One may also construct RDFs based on two sets of particles, using two ``select*``
arguments. The following example computes the inter-particle RDF for the oxygen
and hydrogen atoms, using a previously computed trajectory file::
f = h5.File('output.h5')
select0 = system.get_indexes('8') # select atoms with ATSELECT rule
select1 = system.get_indexes('1')
rdf = RDF(f, rcut=4.8*angstrom, rspacing=0.1*angstrom, max_sample=100,
select0=select0, select1=select1)
rdf.plot()
rdf.plot_crdf()
f.close()
The reference documentation of the ``rdf`` module (:mod:`yaff.analysis.rdf`)
describes also the more exotic RDF options, e.g. to exclude intra-molecular
distances from the RDF.
Vibrational spectra and correlation times
-----------------------------------------
The (efficient) computation of vibrational spectra and autocorrelation
functions are so closely related that they are carried out simultaneously. A
basic vibrational spectrum (by default based on the atomic velocities) is done
as follows::
spectrum = Spectrum(f, bsize=512)
spectrum.plot()
spectrum.plot_ac()
The ``bsize`` argument determines the size of the blocks used for the
spectral analysis. The trajectory is cut into blocks of the given size. For
each block, the spectrum is computed, and then averaged over all blocks. The
``plot`` method makes a figure of the spectrum. The ``plot_ac`` method makes
a figure of the corresponding autocorrelation function. All the results are
also available as attributes of the spectrum object. Similar to the RDF
analysis, the spectrum can be computed both on-line and off-line. One can
also estimate the IR spectrum as follows::
spectrum = Spectrum(f, bsize=512, path='trajectory/dipole_vel', key='ir')
spectrum.plot()
More details can be found in the reference documention:
:mod:`yaff.analysis.spectrum`.
Diffusion constants
-------------------
The diffusion constant (based on the displacements of all nuclei) is computed
as follows::
diff = Diffusion(f, step=10, mult=5)
print diff.A
diff.plot()
The positions are subsampled with step size of 10. The ``mult`` argument controls
between which time intervals are used to compute the mean-square displacements
(MSDs) used to carry out a fit of the Einstein diffusion equation. When set to
5, the MSDs are computed at intervals of 10, 20, 30, 40 and 50 steps. As soon
as the analysis is ready, a straight line is fitted through the MSD versus time
data. The slope (``diff.A``) of this line corresponds to the diffusion constant.
A plot of these data and the fitted line can be made with ``diff.plot()``.
One may restrict the analysis to just a selection of atoms as follows::
diff = Diffusion(f, step=10, mult=5, select=[0,4,5])
See :mod:`yaff.analysis.diffusion` for more details.
**TODO.** Explain (and double check) how to compute diffusion constants of
multiple molecules instead of just atoms.
Computing errors on thermodynamic averages
------------------------------------------
The block average method can be used to compute the error on time-dependent
data that have some degree of auto-correlation. The following example computes
the error on the average temperature from an MD simulation, using a HDF5
trajectory file. ::
f = h5.file('output.h5')
error, sinef = blav(f['trajectory/temperature'])
f.close()
Two values are returned. ``error`` is the error on the average. ``sinef`` is the
statistical sampling inefficiency. More details about this routine are given
here: :mod:`yaff.sampling.blav`.
Post-processing external trajectory data
========================================
One may also use the analysis module of Yaff to process trajectories generated
with other molecular simulation codes. The following example script (see
``data/examples/002_external_trajectory/rdf.py``) converts the xyz file to HDF5 data,
which is then used to perform the analysis
.. literalinclude:: ../yaff/examples/002_external_trajectory/rdf.py
:lines: 26-
See :mod:`yaff.conversion` for more routines to convert trajectory data.