Input/output¶
It is not the intention to duplicate the work done in the OpenBabel project. Several file formats below are rather specific for molecular dynamics simulations and not all of them deal with representing molecular systems. The scope is more generic. The selection of supported formats is purely driven by the convenience of the MolMod developers.
molmod.io.atrj
– ATRJ Files¶
-
class
molmod.io.atrj.
ATRJReader
(f, sub=slice(None, None, None))¶ A Reader class for the ATRJ trajectory file format
Once initialized, this object behaves like an iterator that runs over the individual time frames. Example usage:
>>> foo = ATRJReader("somefile.atrj") >>> print foo.num_atoms >>> for frame in foo: ... print frame.time
- Arguments:
f
– a filename or a file-like objectsub
– an optional slice object to select a subset of time frames
The number of atoms is read immediately and is available as
self.num_atoms
.
-
class
molmod.io.atrj.
ATRJFrame
¶ A single time frame from an ATRJ file
molmod.io.cml
– CML Files¶
Not all features of the CML standard are supported in this module, only the basic aspects that are relevant for computational chemistry. For more info on the CML format, visit: http://cml.sourceforge.net/
In this module, only atoms, their 3D coordinates, atom numbers, bonds and bond orders are supported.
-
molmod.io.cml.
load_cml
(cml_filename)¶ Load the molecules from a CML file
- Argument:
cml_filename
– The filename of a CML file.
Returns a list of molecule objects with optional molecular graph attribute and extra attributes.
-
molmod.io.cml.
dump_cml
(f, molecules)¶ Write a list of molecules to a CML file
- Arguments:
f
– a filename of a CML file or a file-like objectmolecules
– a list of molecule objects.
molmod.io.cp2k
– CP2K Files¶
-
class
molmod.io.cp2k.
CP2KSection
(name='', children=None, section_parameters='')¶ Data structure representing a section in a CP2K input file
Instances of this class are iterable and support full element access.
- Optional arguments:
name
– the name of the sectionchildren
– a list of CP2KSection and CP2KKeyword objectssection_parameters
– the value assigned to the section
-
append
(child)¶ Add a child section or keyword
-
dump
(f, indent='')¶ Dump this section and its children to a file-like object
-
dump_children
(f, indent='')¶ Dump the children of the current section to a file-like object
-
get_name
()¶ Return the name of this section
-
insert
(index, child)¶ Insert a child section or keyword
-
load
(f, line=None)¶ Load this section from a file-like object
-
load_children
(f)¶ Load the children of this section from a file-like object
-
name
¶ Return the name of this section
-
readline
(f)¶ A helper method that only reads uncommented lines
-
class
molmod.io.cp2k.
CP2KKeyword
(name='', value='', unit=None)¶ Data structure representing a keyword-value pair in a CP2K input file
- Optional arguments:
name
– The keyword namevalue
– The associated valueunit
– The unit (user must guarantee validity of the unit)
-
dump
(f, indent='')¶ Dump this keyword to a file-like object
-
get_name
()¶ Get the name of the keyword
-
get_unit
()¶ Get the name of the unit
-
get_value
()¶ Get the value of the keyword
-
load
(line)¶ Load this keyword from a file-like object
-
name
¶ Get the name of the keyword
-
set_value
(value)¶ Set the value associated with the keyword
-
unit
¶ Get the name of the unit
-
value
¶ Get the value of the keyword
-
class
molmod.io.cp2k.
CP2KInputFile
(children=None)¶ Data structure representing an entire CP2K input file
Initialize a new (empty) CP2KInputFile object
-
dump
(f, indent='')¶ Dump the CP2KInput data structure to a file-like object
-
static
read_from_file
(filename)¶ - Arguments:
filename
– the filename of the input file
Use as follows:
>>> if = CP2KInputFile.read_from_file("somefile.inp") >>> for section in if: ... print section.name
-
write_to_file
(filename)¶ Write the CP2KInput data structure to a file
-
molmod.io.cpmd
– CPMD Files¶
-
class
molmod.io.cpmd.
CPMDTrajectoryReader
(f, sub=slice(None, None, None))¶ A Reader for CPMD trajectory files
Use this reader as an iterator:
>>> ldr = CPMDTrajectoryReader("TRAJECTORY") >>> for pos, vel in ldr: ... print pos[4,2]
- Arguments:
f
– a filename or a file-like object- Optional argument:
sub
– a slice object indicating which time frames to skip/read
molmod.io.gamess
– Gamess punch files¶
-
class
molmod.io.gamess.
PunchFile
(filename)¶ Reader for GAMESS Punch files.
After initialization, the data from the file is available as attributes of the created object. The following attributes are created if the corresponding data is present in the punch file:
title
,symmetry
,symbols
,numbers
,coordinates
,energy
,gradient
andhessian
.- Argument:
filename
– The file to load from.
molmod.io.cube
– Gaussian Cube Files¶
-
molmod.io.cube.
get_cube_points
(origin, axes, nrep)¶ Generate the Cartesian coordinates of the points in a cube file
Arguemnts:
- origin
- The cartesian coordinate for the origin of the grid.
- axes
- The 3 by 3 array with the grid spacings as rows.
- nrep
- The number of grid points along each axis.
-
class
molmod.io.cube.
CubeReader
(filename)¶ Iterator that reads cube files. See the cubegen manual for more information about cube files: http://www.gaussian.com/g_tech/g_ur/u_cubegen.htm
Use this object as an iterator:
>>> cr = CubeReader("test.cube") >>> print cr.numbers >>> for vector, value in cr: ... print vector, value
- Argument:
filename
– the filename with the formatted cube data
-
class
molmod.io.cube.
Cube
(molecule, origin, axes, nrep, data, subtitle='', nuclear_charges=None)¶ A data structure for cube file data.
Arguments:
- molecule
- A Molecule instance.
- origin
- The cartesian coordinate for the origin of the grid.
- axes
- The 3 by 3 array with the grid spacings as rows.
- nrep
- The number of grid points along each axis.
- subtitle
- The title on the second line in the cube file.
- nuclear_charges
- The nuclear charges, can be different from the atomic numbers in case of effective core potentials.
-
copy
(newdata=None)¶ Return a copy of the cube with optionally new data.
-
classmethod
from_file
(filename)¶ Create a cube object by loading data from a file.
Arguemnts:
- filename
- The file to load. It must contain the header with the description of the grid and the molecule.
-
get_points
()¶ Return a Nz*Nb*Nc*3 array with all Cartesian coordinates of the points in the cube.
-
write_to_file
(fn)¶ Write the cube to a file in the Gaussian cube format.
molmod.io.dlpoly
– DLPoly Files¶
-
class
molmod.io.dlpoly.
DLPolyHistoryReader
(f, sub=slice(None, None, None), pos_unit=1.889726133921252, vel_unit=4.5710289267195566e-05, frc_unit=2.015529715950126e-06, time_unit=41341.37333664682, mass_unit=1822.8886273532887, restart=False)¶ A Reader for the DLPoly history file format.
Use this object as an iterator:
>>> hr = HistoryReader("somefile.txt") >>> for frame in hr: ... print frame["cell"]
- Arguments:
f
– a filename or a file-like object- Optional arguments:
sub
– a slice indicating the frames to be skipped/selectedpos_unit
,vel_unit
,frc_unit
,time_unit
,mass_unit
– The conversion factors for the unit conversion from the units in the data file to atomic units. The defaults of these optional arguments correspond to the defaults of dlpoly.
When the file starts with a line that satisfies the following conditions, it is assumed that this is a history restart file:
- line consists of 6 words
- first word equals ‘timestep’
- the following for words are integers
- the last word is a float
-
class
molmod.io.dlpoly.
DLPolyOutputReader
(f, sub=slice(None, None, None), skip_equi_period=True, pos_unit=1.889726133921252, time_unit=41341.37333664682, angle_unit=0.017453292519943295, e_unit=872433224359.0566)¶ A Reader for DLPoly output files.
Use this object as an iterator:
>>> outr = OutputReader("somefile.txt") >>> for row in outr: ... print row[5]
The variable row in the example above is a concatenation of all the values that belong to one time frame. (line after line)
- Arguments:
f
– a filename or a file-like object- Optional arguments:
sub
– a slice indicating the frames to be skipped/selectedskip_equi_period
– When True, the equilibration period is not read [default=True]pos_unit
,time_unit
,angle_unit
,e_unit
– The conversion factors for the unit conversion from the units in the data file to atomic units. The defaults of these optional arguments correspond to the defaults of dlpoly.
-
goto_next_frame
()¶ Continue reading until the next frame is reached
molmod.io.fchk
– Gaussian Formatted Checkpoint Files¶
-
class
molmod.io.fchk.
FCHKFile
(filename, ignore_errors=False, field_labels=None)¶ Reader for Formatted checkpoint files
After initialization, the data from the file is available in the fields dictionary. Also the following attributes are read from the file: title, command, lot (level of theory) and basis.
- Arguments:
filename
– The formatted checkpoint file- Optional arguments:
ignore_errors
– Try to read incorrectly formatted files without raising exceptions [default=False]field_labels
– When given, only these fields are read from the formatted checkpoint file. (This can save a lot of time.)
-
get_hessian
()¶ Return the hessian
-
get_optimization_coordinates
()¶ Return the coordinates of the geometries at each point in the optimization
-
get_optimization_energies
()¶ Return an array with the energy at each point in the optimization
-
get_optimization_gradients
()¶ Return the energy gradients of all geometries during an optimization
-
get_optimization_lowest_index
()¶ Return the index of the lowest energy during the optimization process
-
get_optimized_energy
()¶ Get the final energy after optimization
-
get_optimized_gradient
()¶ Return the energy gradient of the optimized geometry
-
get_optimized_molecule
()¶ Return a molecule object of the optimal geometry
molmod.io.number_state
– Custom Formatted Checkpoint Files¶
-
class
molmod.io.number_state.
NumberState
(owner, names)¶ Component class for data structures with human-readable persistence.
The format used to save and load the object is similar to a formatted checkpoint file from the Gaussian package. Some additional info is stored such as the shape of an array and the exact data type of the array elements.
The attributes that contain data to be read from or to be written to files are set up in the constructor of the owner class. This is a typical simple example:
>>> class Foo(object): ... def __init__(self, a, b): ... self.a = a ... self.b = b ... self.state = NumberState(self, ["a", "b"])
In this example a is an array and b is a single scalar. One can now read/write these attributes to a file as follows:
>>> foo = Foo(a, b) >>> foo.state.dump("somefile.txt") >>> foo.state.load("somefile.txt")
- Arguments:
owner
– the object whose attributes are dumped and loadednames
– a list of attribute names to dump and load
-
dump
(filename)¶ Dump the registered fields to a file
- Argument:
filename
– the file to write to
-
get
(subset=None)¶ Return a dictionary object with the registered fields and their values
- Optional rgument:
subset
– a list of names to restrict the number of fields in the result
-
load
(filename, subset=None)¶ Load data into the registered fields
- Argument:
filename
– the filename to read from- Optional argument:
subset
– a list of field names that are read from the file. If not given, all data is read from the file.
-
set
(new_fields, subset=None)¶ Assign the registered fields based on a dictionary
- Argument:
new_fields
– the dictionary with the data to be assigned to the attributes- Optional argument:
subset
– a list of names to restrict the fields that are effectively overwritten
molmod.io.gromacs
– Gromacs Files¶
-
class
molmod.io.gromacs.
GroReader
(f, sub=slice(None, None, None))¶ A reader from .gro trajectory files from gromacs
Use this reader as an iterator:
>>> gr = GroReader("somefile.gro") >>> for time, pos, vel, cell in gr: ... print pos
- Argument:
f
– a filename or a file-like object- Optional argument:
sub
– a slice object to indicate the frames to be read/skipped.
molmod.io.lammps
– LAMMPS Files¶
-
class
molmod.io.lammps.
LAMMPSDumpReader
(f, units, sub=slice(None, None, None))¶ A Reader for LAMMPS dump files
Use this reader as an iterator:
>>> ldr = LAMMPSDumpReader("some_file.dump", [some, units]) >>> for fields in ldr: ... print fields[0]
- Arguments:
f
– a filename or a file-like objectunits
– The units of the atom fields. The number of fields, their unit and their meaning depends on the input file of the LAMMPS simulation.- Optional argumtent:
sub
– a slice object indicating which time frames to skip/read
molmod.io.pdb
– PDB Files¶
-
molmod.io.pdb.
load_pdb
(filename)¶ Loads a single molecule from a pdb file.
This function does support only a small fragment from the pdb specification. It assumes that there is only one molecular geometry in the pdb file.
-
molmod.io.pdb.
dump_pdb
(filename, molecule, atomnames=None, resnames=None, chain_ids=None, occupancies=None, betas=None)¶ Writes a single molecule to a pdb file.
This function is based on the pdb file specification: http://www.wwpdb.org/documentation/format32/sect9.html For convenience, the relevant table is copied and the character indexes are transformed to C-style (starting from zero)
COLUMNS DATA TYPE FIELD DEFINITION 0 - 5 Record name “ATOM “ 6 - 10 Integer serial Atom serial number. 12 - 15 Atom name Atom name. 16 Character altLoc Alternate location indicator. 17 - 19 Residue name resName Residue name. 21 Character chainID Chain identifier. 22 - 25 Integer resSeq Residue sequence number. 26 AChar iCode Code for insertion of residues. 30 - 37 Real(8.3) x Orthogonal coordinates for X in Angstroms. 38 - 45 Real(8.3) y Orthogonal coordinates for Y in Angstroms. 46 - 53 Real(8.3) z Orthogonal coordinates for Z in Angstroms. 54 - 59 Real(6.2) occupancy Occupancy. 60 - 65 Real(6.2) tempFactor Temperature factor. 76 - 77 LString(2) element Element symbol, right-justified. 78 - 79 LString(2) charge Charge on the atom.
molmod.io.psf
– PSF Files¶
This format is orignally developed in conjunction with the CHARMM program, but is also used by CP2K as a generic format to define the molecular bond graph and other topological aspects of a molecular model. This module just creates files that can be read with CP2K.
Due to the lack of public definition of a PSF file format, several variations exist. Therefore we do not make any attempt to follow the original format strictly, nor one of these variations. It would be more interesting to define a new public standard for molecular topologies, which is also suitable for non-protein systems.
-
class
molmod.io.psf.
PSFFile
(filename=None)¶ A very simplistic and limited implementation of the PSF file format
- Argument:
filename
– When not given, an empty data structure is created, otherwise the file is loaded from disk
-
add_molecular_graph
(molecular_graph, atom_types=None, charges=None, split=True, molecule=None)¶ Add the molecular graph to the data structure
- Argument:
molecular_graph
– a MolecularGraph instance- Optional arguments:
atom_types
– a list with atom type stringscharges
– The net atom chargessplit
– When True, the molecule is split into disconnected molecules [default=True]
-
add_molecule
(molecule, atom_types=None, charges=None, split=True)¶ Add the graph of the molecule to the data structure
The molecular graph is estimated from the molecular geometry based on interatomic distances.
- Argument:
molecule
– a Molecule instance- Optional arguments:
atom_types
– a list with atom type stringscharges
– The net atom chargessplit
– When True, the molecule is split into disconnected molecules [default=True]
-
clear
()¶ Clear the contents of the data structure
-
dump
(f)¶ Dump the data structure to a file-like object
-
get_graph
()¶ Return the bond graph represented by the data structure
-
get_groups
()¶ Return a list of groups of atom indexes
Each atom in a group belongs to the same molecule or residue.
-
get_molecular_graph
()¶ Return the molecular graph represented by the data structure
-
read_from_file
(filename)¶ Load a PSF file
-
write_to_file
(filename)¶ Write the data structure to a file
molmod.io.sdf
– SDF Files¶
For more information about the file format, see http://www.epa.gov/ncct/dsstox/MoreonSDF.html
-
class
molmod.io.sdf.
SDFReader
(f)¶ A basic reader for SDF files.
Use this reader as an iterator:
>>> sr = SDFReader("somefile.sdf") >>> for mol in sr: ... print mol.title
- Argument:
f
– a filename or a file-like object
molmod.io.xyz
– XYZ Files¶
-
class
molmod.io.xyz.
XYZReader
(f, sub=slice(None, None, None), file_unit=1.889726133921252)¶ A reader for XYZ trajectory files
Use this reader as an iterator:
>>> xr = XYZReader("somefile.xyz") >>> for title, coordinates in xr: print title
Initialize an XYZ reader
- Arguments:
f
– a filename or a file-like object- Optional arguments:
sub
– a slice indicating which frames to read/skipfile_unit
– the conversion constant to convert data into atomic units [default=angstrom]- After initialization, the following attributes are defined:
symbols
– The atom symbolsnumbers
– The atom numbers
-
get_first_molecule
()¶ Get the first molecule from the trajectory
This can be useful to configure your program before handeling the actual trajectory.
-
read_size
()¶ Read the number of atoms
-
class
molmod.io.xyz.
XYZWriter
(f, symbols, file_unit=1.889726133921252)¶ A XYZ trajectory file writer
This writer is designed to be used in conjunction with an iterator that generates coordinates in one way or the other. Example:
>>> xr = XYZReader("somefile.xyz") >>> xw = XYZWriter("otherfile.xyz", xr.symbols[5:10]) >>> for title, coordinates in xr: ... xw.dump(title, -coordinates[5:10])
- Arguments:
f
– a filename or a file-like object to write tosymbols
– the atom symbols- Optional argument
file_unit
– the unit of the values written to file [default=angstrom]
-
dump
(title, coordinates)¶ Dump a frame to the trajectory file
- Arguments:
title
– the title of the framecoordinates
– a numpy array with coordinates in atomic units
-
class
molmod.io.xyz.
XYZFile
(f, sub=slice(None, None, None), file_unit=1.889726133921252)¶ Data structure representing an XYZ File
This implementation is extra layer on top the XYZReader and XYZWriter, and is somewhat easier to use. Example:
>>> xyz_file = XYZFile("some.xyz") >>> mol = xyz_file.get_molecule(3) >>> print mol.title >>> xyz_file.geometries[0, 4, 2] = 5.0 # frame 0, atom 4, Z-coordinate >>> xyz_file.write_to_file("other.xyz")
Initialize an XYZFile object
- Argument:
f
– a filename or a file-like object to read from- Optional arguments:
sub
– a slice indicating which frames to read/skipfile_unit
– the conversion constant to convert data into atomic units [default=angstrom]- XYZFile instances always have to following attriubtes:
numbers
– The atom numbers (of one frame)symbols
– The atom symbols (of one frame)titles
– The titles of all the framesgeometries
– A MxNx3 array with all the atom coordinates
-
get_molecule
(index=0)¶ Get a molecule from the trajectory
- Optional argument:
index
– The frame index [default=0]
-
write_to_file
(f, file_unit=1.889726133921252)¶ Write the trajectory to a file
- Argument:
f
– a filename or a file-like object to write to- Optional argument:
file_unit
– the unit of the values written to file [default=angstrom]
molmod.io.common
– Common routines¶
-
molmod.io.common.
slice_match
(sub, counter)¶ Efficiently test if counter is in
xrange(*sub)
- Arguments:
sub
– a slice objectcounter
– an integer
The function returns True if the counter is in
xrange(sub.start, sub.stop, sub.step)
.
-
exception
molmod.io.common.
FileFormatError
¶ Is raised when unexpected data is encountered while reading a file
-
class
molmod.io.common.
SlicedReader
(f, sub=slice(None, None, None))¶ Base class for readers that can read a slice of all the frames
- Argument:
f
– a filename or a file-like object- Optional argument:
sub
– a slice indicating which frames to read/skip