.. : YAFF is yet another force-field code. : Copyright (C) 2011 Toon Verstraelen , : Louis Vanduyfhuys , Center for Molecular Modeling : (CMM), Ghent University, Ghent, Belgium; all rights reserved unless otherwise : stated. : : This file is part of YAFF. : : YAFF is free software; you can redistribute it and/or : modify it under the terms of the GNU General Public License : as published by the Free Software Foundation; either version 3 : of the License, or (at your option) any later version. : : YAFF is distributed in the hope that it will be useful, : but WITHOUT ANY WARRANTY; without even the implied warranty of : MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the : GNU General Public License for more details. : : You should have received a copy of the GNU General Public License : along with this program; if not, see : : -- Trajectory Analysis ################### The analysis of the results is (in the first place) based on the HDF5 trajectory output files. On-line analysis (during the iterative algorithm, without writing data to disk) is also possible. As a final option, trajectory data from other programs may also be converted to the HDF5 format used by Yaff, which can then be post-processed as if it was generated by Yaff. In all the examples below, the following import statement is implicitly assumed:: import h5py as h5 Slicing the data ================ All the analysis routines below have at least the following four optional arguments: * ``start``: the first sample to consider for the analysis * ``step``: consider only a sample each ``step`` iterations. * ``max_sample``: consider at most ``max_sample`` number of samples. Some routines also support the following optional argument: * ``end``: the last sample to consider for the analysis. The ``max_sample`` option is only possible when ``step`` is not specified and the total number of samples (or ``end``) is known. When ``max_sample`` is given, the corresponding optimal value for ``step`` will be derived from ``max_sample``. Some analysis may not have the max_sample argument, e.g. the spectrum analysis, because the choice of the step size for such analysis is a critical parameter that needs to be set carefully. See :func:`yaff.analysis.utils.get_slice` for the documentation of the generic underlying routine that handles the slices. Basic analysis ============== A few basic analysis routines are provided to quickly check the sanity of an MD simulation: * ``plot_energies`` makes a plot of the kinetic and the total energy as function of time. For example:: f = h5.File('output.h5') plot_energies(f) f.close() makes a figure ``energies.png``. * ``plot_temperate`` and ``plot_pressure`` are similar, but they plot the temperature and pressure as function of time instead of the energies. * ``plot_temp_dist`` plots the distribution (both pdf and cdf) of the instantaneous atomic and system temperatures and compares these with the expected analytical result for a constant-temperature ensemble. For example:: f = h5.File('output.h5') plot_temp_dist(f) f.close() makes a figure ``temp_dist.png`` All these functions accept optional arguments to tune their behavior. See :mod:`yaff.analysis.basic` for more analysis routines and more details. Advanced analysis ================= Yaff also includes analysis tools that can extract relevant macroscopic properties from a simulation. These analysis tools require some additional computations that can either be done in a post-processing step, or on-line. Radial distribution functions (RDFs) ------------------------------------ The following example computes and oxygen-oxygen RDF using a previously generated trajectory file:: f = h5.File('output.h5') select = system.get_indexes('8') # select atoms with ATSELECT rule rdf = RDF(f, rcut=4.8*angstrom, rspacing=0.1*angstrom, max_sample=100, select0=select) rdf.plot() rdf.plot_crdf() f.close() In this example, the cutoff for the RDF is 4.8 Å and the spacing of the bins 0.1 Å. At most 100 samples are used to compute the RDF. The results are included in the HDF5 file, and optionally plotted using matplotlib. Alternatively, the same ``RDFAnalysis`` class can be used for on-line analysis, without the need to store huge amounts of data on disk:: select = system.get_indexes('O') rdf = RDF(None, 4.8*angstrom, 0.1*angstrom, max_sample=100, select0=select) verlet = VerletIntegrator(ff, hooks=rdf, temp0=300) verlet.run(5000) rdf.plot() rdf.plot_crdf() One may also construct RDFs based on two sets of particles, using two ``select*`` arguments. The following example computes the inter-particle RDF for the oxygen and hydrogen atoms, using a previously computed trajectory file:: f = h5.File('output.h5') select0 = system.get_indexes('8') # select atoms with ATSELECT rule select1 = system.get_indexes('1') rdf = RDF(f, rcut=4.8*angstrom, rspacing=0.1*angstrom, max_sample=100, select0=select0, select1=select1) rdf.plot() rdf.plot_crdf() f.close() The reference documentation of the ``rdf`` module (:mod:`yaff.analysis.rdf`) describes also the more exotic RDF options, e.g. to exclude intra-molecular distances from the RDF. Vibrational spectra and correlation times ----------------------------------------- The (efficient) computation of vibrational spectra and autocorrelation functions are so closely related that they are carried out simultaneously. A basic vibrational spectrum (by default based on the atomic velocities) is done as follows:: spectrum = Spectrum(f, bsize=512) spectrum.plot() spectrum.plot_ac() The ``bsize`` argument determines the size of the blocks used for the spectral analysis. The trajectory is cut into blocks of the given size. For each block, the spectrum is computed, and then averaged over all blocks. The ``plot`` method makes a figure of the spectrum. The ``plot_ac`` method makes a figure of the corresponding autocorrelation function. All the results are also available as attributes of the spectrum object. Similar to the RDF analysis, the spectrum can be computed both on-line and off-line. One can also estimate the IR spectrum as follows:: spectrum = Spectrum(f, bsize=512, path='trajectory/dipole_vel', key='ir') spectrum.plot() More details can be found in the reference documention: :mod:`yaff.analysis.spectrum`. Diffusion constants ------------------- The diffusion constant (based on the displacements of all nuclei) is computed as follows:: diff = Diffusion(f, step=10, mult=5) print diff.A diff.plot() The positions are subsampled with step size of 10. The ``mult`` argument controls between which time intervals are used to compute the mean-square displacements (MSDs) used to carry out a fit of the Einstein diffusion equation. When set to 5, the MSDs are computed at intervals of 10, 20, 30, 40 and 50 steps. As soon as the analysis is ready, a straight line is fitted through the MSD versus time data. The slope (``diff.A``) of this line corresponds to the diffusion constant. A plot of these data and the fitted line can be made with ``diff.plot()``. One may restrict the analysis to just a selection of atoms as follows:: diff = Diffusion(f, step=10, mult=5, select=[0,4,5]) See :mod:`yaff.analysis.diffusion` for more details. **TODO.** Explain (and double check) how to compute diffusion constants of multiple molecules instead of just atoms. Computing errors on thermodynamic averages ------------------------------------------ The block average method can be used to compute the error on time-dependent data that have some degree of auto-correlation. The following example computes the error on the average temperature from an MD simulation, using a HDF5 trajectory file. :: f = h5.file('output.h5') error, sinef = blav(f['trajectory/temperature']) f.close() Two values are returned. ``error`` is the error on the average. ``sinef`` is the statistical sampling inefficiency. More details about this routine are given here: :mod:`yaff.sampling.blav`. Post-processing external trajectory data ======================================== One may also use the analysis module of Yaff to process trajectories generated with other molecular simulation codes. The following example script (see ``data/examples/002_external_trajectory/rdf.py``) converts the xyz file to HDF5 data, which is then used to perform the analysis .. literalinclude:: ../yaff/examples/002_external_trajectory/rdf.py :lines: 26- See :mod:`yaff.conversion` for more routines to convert trajectory data.