Input/output

It is not the intention to duplicate the work done in the OpenBabel project. Several file formats below are rather specific for molecular dynamics simulations and not all of them deal with representing molecular systems. The scope is more generic. The selection of supported formats is purely driven by the convenience of the MolMod developers.

molmod.io.atrj – ATRJ Files

class molmod.io.atrj.ATRJReader(f, sub=slice(None, None, None))

A Reader class for the ATRJ trajectory file format

Once initialized, this object behaves like an iterator that runs over the individual time frames. Example usage:

>>> foo = ATRJReader("somefile.atrj")
>>> print foo.num_atoms
>>> for frame in foo:
...    print frame.time
Arguments:
f – a filename or a file-like object
sub – an optional slice object to select a subset of time frames

The number of atoms is read immediately and is available as self.num_atoms.

class molmod.io.atrj.ATRJFrame

A single time frame from an ATRJ file

molmod.io.cml – CML Files

Not all features of the CML standard are supported in this module, only the basic aspects that are relevant for computational chemistry. For more info on the CML format, visit: http://cml.sourceforge.net/

In this module, only atoms, their 3D coordinates, atom numbers, bonds and bond orders are supported.

molmod.io.cml.load_cml(cml_filename)

Load the molecules from a CML file

Argument:
cml_filename – The filename of a CML file.

Returns a list of molecule objects with optional molecular graph attribute and extra attributes.

molmod.io.cml.dump_cml(f, molecules)

Write a list of molecules to a CML file

Arguments:
f – a filename of a CML file or a file-like object
molecules – a list of molecule objects.

molmod.io.cp2k – CP2K Files

class molmod.io.cp2k.CP2KSection(name='', children=None, section_parameters='')

Data structure representing a section in a CP2K input file

Instances of this class are iterable and support full element access.

Optional arguments:
name – the name of the section
children – a list of CP2KSection and CP2KKeyword objects
section_parameters – the value assigned to the section
append(child)

Add a child section or keyword

dump(f, indent='')

Dump this section and its children to a file-like object

dump_children(f, indent='')

Dump the children of the current section to a file-like object

get_name()

Return the name of this section

insert(index, child)

Insert a child section or keyword

load(f, line=None)

Load this section from a file-like object

load_children(f)

Load the children of this section from a file-like object

name

Return the name of this section

readline(f)

A helper method that only reads uncommented lines

class molmod.io.cp2k.CP2KKeyword(name='', value='', unit=None)

Data structure representing a keyword-value pair in a CP2K input file

Optional arguments:
name – The keyword name
value – The associated value
unit – The unit (user must guarantee validity of the unit)
dump(f, indent='')

Dump this keyword to a file-like object

get_name()

Get the name of the keyword

get_unit()

Get the name of the unit

get_value()

Get the value of the keyword

load(line)

Load this keyword from a file-like object

name

Get the name of the keyword

set_value(value)

Set the value associated with the keyword

unit

Get the name of the unit

value

Get the value of the keyword

class molmod.io.cp2k.CP2KInputFile(children=None)

Data structure representing an entire CP2K input file

Initialize a new (empty) CP2KInputFile object

dump(f, indent='')

Dump the CP2KInput data structure to a file-like object

static read_from_file(filename)
Arguments:
filename – the filename of the input file

Use as follows:

>>> if = CP2KInputFile.read_from_file("somefile.inp")
>>> for section in if:
...     print section.name
write_to_file(filename)

Write the CP2KInput data structure to a file

molmod.io.cpmd – CPMD Files

class molmod.io.cpmd.CPMDTrajectoryReader(f, sub=slice(None, None, None))

A Reader for CPMD trajectory files

Use this reader as an iterator:

>>> ldr = CPMDTrajectoryReader("TRAJECTORY")
>>> for pos, vel in ldr:
...     print pos[4,2]
Arguments:
f – a filename or a file-like object
Optional argument:
sub – a slice object indicating which time frames to skip/read

molmod.io.gamess – Gamess punch files

class molmod.io.gamess.PunchFile(filename)

Reader for GAMESS Punch files.

After initialization, the data from the file is available as attributes of the created object. The following attributes are created if the corresponding data is present in the punch file: title, symmetry, symbols, numbers, coordinates, energy, gradient and hessian.

Argument:
filename – The file to load from.

molmod.io.cube – Gaussian Cube Files

molmod.io.cube.get_cube_points(origin, axes, nrep)

Generate the Cartesian coordinates of the points in a cube file

Arguemnts:

origin
The cartesian coordinate for the origin of the grid.
axes
The 3 by 3 array with the grid spacings as rows.
nrep
The number of grid points along each axis.
class molmod.io.cube.CubeReader(filename)

Iterator that reads cube files. See the cubegen manual for more information about cube files: http://www.gaussian.com/g_tech/g_ur/u_cubegen.htm

Use this object as an iterator:

>>> cr = CubeReader("test.cube")
>>> print cr.numbers
>>> for vector, value in cr:
...     print vector, value
Argument:
filename – the filename with the formatted cube data
class molmod.io.cube.Cube(molecule, origin, axes, nrep, data, subtitle='', nuclear_charges=None)

A data structure for cube file data.

Arguments:

molecule
A Molecule instance.
origin
The cartesian coordinate for the origin of the grid.
axes
The 3 by 3 array with the grid spacings as rows.
nrep
The number of grid points along each axis.
subtitle
The title on the second line in the cube file.
nuclear_charges
The nuclear charges, can be different from the atomic numbers in case of effective core potentials.
copy(newdata=None)

Return a copy of the cube with optionally new data.

classmethod from_file(filename)

Create a cube object by loading data from a file.

Arguemnts:

filename
The file to load. It must contain the header with the description of the grid and the molecule.
get_points()

Return a Nz*Nb*Nc*3 array with all Cartesian coordinates of the points in the cube.

write_to_file(fn)

Write the cube to a file in the Gaussian cube format.

molmod.io.dlpoly – DLPoly Files

class molmod.io.dlpoly.DLPolyHistoryReader(f, sub=slice(None, None, None), pos_unit=1.889726133921252, vel_unit=4.5710289267195566e-05, frc_unit=2.015529715950126e-06, time_unit=41341.37333664682, mass_unit=1822.8886273532887, restart=False)

A Reader for the DLPoly history file format.

Use this object as an iterator:

>>> hr = HistoryReader("somefile.txt")
>>> for frame in hr:
...     print frame["cell"]
Arguments:
f – a filename or a file-like object
Optional arguments:
sub – a slice indicating the frames to be skipped/selected
pos_unit, vel_unit, frc_unit, time_unit, mass_unit – The conversion factors for the unit conversion from the units in the data file to atomic units. The defaults of these optional arguments correspond to the defaults of dlpoly.

When the file starts with a line that satisfies the following conditions, it is assumed that this is a history restart file:

  • line consists of 6 words
  • first word equals ‘timestep’
  • the following for words are integers
  • the last word is a float
class molmod.io.dlpoly.DLPolyOutputReader(f, sub=slice(None, None, None), skip_equi_period=True, pos_unit=1.889726133921252, time_unit=41341.37333664682, angle_unit=0.017453292519943295, e_unit=872433224359.0566)

A Reader for DLPoly output files.

Use this object as an iterator:

>>> outr = OutputReader("somefile.txt")
>>> for row in outr:
...     print row[5]

The variable row in the example above is a concatenation of all the values that belong to one time frame. (line after line)

Arguments:
f – a filename or a file-like object
Optional arguments:
sub – a slice indicating the frames to be skipped/selected
skip_equi_period – When True, the equilibration period is not read [default=True]
pos_unit, time_unit, angle_unit, e_unit – The conversion factors for the unit conversion from the units in the data file to atomic units. The defaults of these optional arguments correspond to the defaults of dlpoly.
goto_next_frame()

Continue reading until the next frame is reached

molmod.io.fchk – Gaussian Formatted Checkpoint Files

class molmod.io.fchk.FCHKFile(filename, ignore_errors=False, field_labels=None)

Reader for Formatted checkpoint files

After initialization, the data from the file is available in the fields dictionary. Also the following attributes are read from the file: title, command, lot (level of theory) and basis.

Arguments:
filename – The formatted checkpoint file
Optional arguments:
ignore_errors – Try to read incorrectly formatted files without raising exceptions [default=False]
field_labels – When given, only these fields are read from the formatted checkpoint file. (This can save a lot of time.)
get_hessian()

Return the hessian

get_optimization_coordinates()

Return the coordinates of the geometries at each point in the optimization

get_optimization_energies()

Return an array with the energy at each point in the optimization

get_optimization_gradients()

Return the energy gradients of all geometries during an optimization

get_optimization_lowest_index()

Return the index of the lowest energy during the optimization process

get_optimized_energy()

Get the final energy after optimization

get_optimized_gradient()

Return the energy gradient of the optimized geometry

get_optimized_molecule()

Return a molecule object of the optimal geometry

molmod.io.number_state – Custom Formatted Checkpoint Files

class molmod.io.number_state.NumberState(owner, names)

Component class for data structures with human-readable persistence.

The format used to save and load the object is similar to a formatted checkpoint file from the Gaussian package. Some additional info is stored such as the shape of an array and the exact data type of the array elements.

The attributes that contain data to be read from or to be written to files are set up in the constructor of the owner class. This is a typical simple example:

>>> class Foo(object):
...     def __init__(self, a, b):
...         self.a = a
...         self.b = b
...         self.state = NumberState(self, ["a", "b"])

In this example a is an array and b is a single scalar. One can now read/write these attributes to a file as follows:

>>> foo = Foo(a, b)
>>> foo.state.dump("somefile.txt")
>>> foo.state.load("somefile.txt")
Arguments:
owner – the object whose attributes are dumped and loaded
names – a list of attribute names to dump and load
dump(filename)

Dump the registered fields to a file

Argument:
filename – the file to write to
get(subset=None)

Return a dictionary object with the registered fields and their values

Optional rgument:
subset – a list of names to restrict the number of fields in the result
load(filename, subset=None)

Load data into the registered fields

Argument:
filename – the filename to read from
Optional argument:
subset – a list of field names that are read from the file. If not given, all data is read from the file.
set(new_fields, subset=None)

Assign the registered fields based on a dictionary

Argument:
new_fields – the dictionary with the data to be assigned to the attributes
Optional argument:
subset – a list of names to restrict the fields that are effectively overwritten

molmod.io.gromacs – Gromacs Files

class molmod.io.gromacs.GroReader(f, sub=slice(None, None, None))

A reader from .gro trajectory files from gromacs

Use this reader as an iterator:

>>> gr = GroReader("somefile.gro")
>>> for time, pos, vel, cell in gr:
...     print pos
Argument:
f – a filename or a file-like object
Optional argument:
sub – a slice object to indicate the frames to be read/skipped.

molmod.io.lammps – LAMMPS Files

class molmod.io.lammps.LAMMPSDumpReader(f, units, sub=slice(None, None, None))

A Reader for LAMMPS dump files

Use this reader as an iterator:

>>> ldr = LAMMPSDumpReader("some_file.dump", [some, units])
>>> for fields in ldr:
...     print fields[0]
Arguments:
f – a filename or a file-like object
units – The units of the atom fields. The number of fields, their unit and their meaning depends on the input file of the LAMMPS simulation.
Optional argumtent:
sub – a slice object indicating which time frames to skip/read

molmod.io.pdb – PDB Files

molmod.io.pdb.load_pdb(filename)

Loads a single molecule from a pdb file.

This function does support only a small fragment from the pdb specification. It assumes that there is only one molecular geometry in the pdb file.

molmod.io.pdb.dump_pdb(filename, molecule, atomnames=None, resnames=None, chain_ids=None, occupancies=None, betas=None)

Writes a single molecule to a pdb file.

This function is based on the pdb file specification: http://www.wwpdb.org/documentation/format32/sect9.html For convenience, the relevant table is copied and the character indexes are transformed to C-style (starting from zero)

COLUMNS DATA TYPE FIELD DEFINITION
0 - 5 Record name “ATOM “  
6 - 10 Integer serial Atom serial number.
12 - 15 Atom name Atom name.
16 Character altLoc Alternate location indicator.
17 - 19 Residue name resName Residue name.
21 Character chainID Chain identifier.
22 - 25 Integer resSeq Residue sequence number.
26 AChar iCode Code for insertion of residues.
30 - 37 Real(8.3) x Orthogonal coordinates for X in Angstroms.
38 - 45 Real(8.3) y Orthogonal coordinates for Y in Angstroms.
46 - 53 Real(8.3) z Orthogonal coordinates for Z in Angstroms.
54 - 59 Real(6.2) occupancy Occupancy.
60 - 65 Real(6.2) tempFactor Temperature factor.
76 - 77 LString(2) element Element symbol, right-justified.
78 - 79 LString(2) charge Charge on the atom.

molmod.io.psf – PSF Files

This format is orignally developed in conjunction with the CHARMM program, but is also used by CP2K as a generic format to define the molecular bond graph and other topological aspects of a molecular model. This module just creates files that can be read with CP2K.

Due to the lack of public definition of a PSF file format, several variations exist. Therefore we do not make any attempt to follow the original format strictly, nor one of these variations. It would be more interesting to define a new public standard for molecular topologies, which is also suitable for non-protein systems.

class molmod.io.psf.PSFFile(filename=None)

A very simplistic and limited implementation of the PSF file format

Argument:
filename – When not given, an empty data structure is created, otherwise the file is loaded from disk
add_molecular_graph(molecular_graph, atom_types=None, charges=None, split=True, molecule=None)

Add the molecular graph to the data structure

Argument:
molecular_graph – a MolecularGraph instance
Optional arguments:
atom_types – a list with atom type strings
charges – The net atom charges
split – When True, the molecule is split into disconnected molecules [default=True]
add_molecule(molecule, atom_types=None, charges=None, split=True)

Add the graph of the molecule to the data structure

The molecular graph is estimated from the molecular geometry based on interatomic distances.

Argument:
molecule – a Molecule instance
Optional arguments:
atom_types – a list with atom type strings
charges – The net atom charges
split – When True, the molecule is split into disconnected molecules [default=True]
clear()

Clear the contents of the data structure

dump(f)

Dump the data structure to a file-like object

get_graph()

Return the bond graph represented by the data structure

get_groups()

Return a list of groups of atom indexes

Each atom in a group belongs to the same molecule or residue.

get_molecular_graph()

Return the molecular graph represented by the data structure

read_from_file(filename)

Load a PSF file

write_to_file(filename)

Write the data structure to a file

molmod.io.sdf – SDF Files

For more information about the file format, see http://www.epa.gov/ncct/dsstox/MoreonSDF.html

class molmod.io.sdf.SDFReader(f)

A basic reader for SDF files.

Use this reader as an iterator:

>>> sr = SDFReader("somefile.sdf")
>>> for mol in sr:
...     print mol.title
Argument:
f – a filename or a file-like object

molmod.io.xyz – XYZ Files

class molmod.io.xyz.XYZReader(f, sub=slice(None, None, None), file_unit=1.889726133921252)

A reader for XYZ trajectory files

Use this reader as an iterator:

>>> xr = XYZReader("somefile.xyz")
>>> for title, coordinates in xr:
        print title

Initialize an XYZ reader

Arguments:
f – a filename or a file-like object
Optional arguments:
sub – a slice indicating which frames to read/skip
file_unit – the conversion constant to convert data into atomic units [default=angstrom]
After initialization, the following attributes are defined:
symbols – The atom symbols
numbers – The atom numbers
get_first_molecule()

Get the first molecule from the trajectory

This can be useful to configure your program before handeling the actual trajectory.

read_size()

Read the number of atoms

class molmod.io.xyz.XYZWriter(f, symbols, file_unit=1.889726133921252)

A XYZ trajectory file writer

This writer is designed to be used in conjunction with an iterator that generates coordinates in one way or the other. Example:

>>> xr = XYZReader("somefile.xyz")
>>> xw = XYZWriter("otherfile.xyz", xr.symbols[5:10])
>>> for title, coordinates in xr:
...    xw.dump(title, -coordinates[5:10])
Arguments:
f – a filename or a file-like object to write to
symbols – the atom symbols
Optional argument
file_unit – the unit of the values written to file [default=angstrom]
dump(title, coordinates)

Dump a frame to the trajectory file

Arguments:
title – the title of the frame
coordinates – a numpy array with coordinates in atomic units
class molmod.io.xyz.XYZFile(f, sub=slice(None, None, None), file_unit=1.889726133921252)

Data structure representing an XYZ File

This implementation is extra layer on top the XYZReader and XYZWriter, and is somewhat easier to use. Example:

>>> xyz_file = XYZFile("some.xyz")
>>> mol = xyz_file.get_molecule(3)
>>> print mol.title
>>> xyz_file.geometries[0, 4, 2] = 5.0 # frame 0, atom 4, Z-coordinate
>>> xyz_file.write_to_file("other.xyz")

Initialize an XYZFile object

Argument:
f – a filename or a file-like object to read from
Optional arguments:
sub – a slice indicating which frames to read/skip
file_unit – the conversion constant to convert data into atomic units [default=angstrom]
XYZFile instances always have to following attriubtes:
numbers – The atom numbers (of one frame)
symbols – The atom symbols (of one frame)
titles – The titles of all the frames
geometries – A MxNx3 array with all the atom coordinates
get_molecule(index=0)

Get a molecule from the trajectory

Optional argument:
index – The frame index [default=0]
write_to_file(f, file_unit=1.889726133921252)

Write the trajectory to a file

Argument:
f – a filename or a file-like object to write to
Optional argument:
file_unit – the unit of the values written to file [default=angstrom]

molmod.io.common – Common routines

molmod.io.common.slice_match(sub, counter)

Efficiently test if counter is in xrange(*sub)

Arguments:
sub – a slice object
counter – an integer

The function returns True if the counter is in xrange(sub.start, sub.stop, sub.step).

exception molmod.io.common.FileFormatError

Is raised when unexpected data is encountered while reading a file

class molmod.io.common.SlicedReader(f, sub=slice(None, None, None))

Base class for readers that can read a slice of all the frames

Argument:
f – a filename or a file-like object
Optional argument:
sub – a slice indicating which frames to read/skip