Basic classes¶
molmod.molecules
– Molecular systems¶
Although the name of the class Molecule
is singular, it is equally
suitable for a cluster or complex with multiple molecules, or even a complete
unit cell of a periodic system.
-
class
molmod.molecules.
Molecule
(numbers, coordinates=None, title=None, masses=None, graph=None, symbols=None, unit_cell=None)¶ Extensible class for molecular systems.
Most attributes of the molecule object are treated as constants. If you want to modify the molecular geometry, just create a modified molecule object with the method
molmod.utils.ReadOnly.copy_with()
. This facilitates the caching of derived quantities such as the distance matrix, while it imposes a cleaner coding style without a signifacant computational overhead.- Mandatory arguments:
numbers
– numpy array (1D, N elements) with the atomic numbers- Optional keyword arguments:
coordinates
– numpy array (2D, Nx3 elements) Cartesian coordinatestitle
– a string with the name of the moleculemassess
– a numpy array with atomic masses in atomic unitsgraph
– a MolecularGraph instancesymbols
– atomic elements or force-field atom-typesunit_cell
– the unit cell in case the system is periodic
-
chemical_formula
¶ Cached attribute: the chemical formula of the molecule.
-
com
¶ Cached attribute: the center of mass of the molecule.
-
compute_rotsym
(threshold=0.0018897261339212521)¶ Compute the rotational symmetry number.
- Optional argument:
threshold
– only when a rotation results in an rmsd below the given threshold, the rotation is considered to transform the molecule onto itself.
-
coordinates
¶ Read-only attribute: atomic Cartesian coordinates.
The attribute must satisfy the following conditions:
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 2.
- Must have shape (?, 3).
- Must have dtype
<type 'numpy.floating'>
. - Special conditions: the number of rows must be the same as the length of the array numbers.
- Must be an instance of
-
distance_matrix
¶ Cached attribute: the matrix with all atom pair distances.
-
classmethod
from_file
(filename)¶ Construct a molecule object read from the given file.
The file format is inferred from the extensions. Currently supported formats are:
*.cml
,*.fchk
,*.pdb
,*.sdf
,*.xyz
If a file contains more than one molecule, only the first one is read.
- Argument:
filename
– the name of the file containing the molecule
Example usage:
>>> mol = Molecule.from_file("foo.xyz")
-
graph
¶ Read-only attribute: the molecular graph with the atom connectivity.
The attribute must satisfy the following conditions:
- Must be an instance of
<class 'molmod.molecular_graphs.MolecularGraph'>
- Special conditions: the atomic numbers must match.
- Must be an instance of
-
inertia_tensor
¶ Cached attribute: the intertia tensor of the molecule.
-
mass
¶ Cached attribute: the total mass of the molecule.
-
masses
¶ Read-only attribute: the atomic masses.
The attribute must satisfy the following conditions:
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 1.
- Must have dtype
<type 'numpy.floating'>
. - Special conditions: the size must be the same as the length of the array numbers.
- Must be an instance of
-
numbers
¶ Read-only attribute: the atomic numbers.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 1.
- Must have dtype
<type 'numpy.signedinteger'>
.
-
rmsd
(other)¶ Compute the RMSD between two molecules.
- Arguments:
other
– Another molecule with the same atom numbers- Return values:
transformation
– the transformation that brings ‘self’ into overlap with ‘other’other_trans
– the transformed coordinates of geometry ‘other’rmsd
– the rmsd of the distances between corresponding atoms in ‘self’ and ‘other’
Make sure the atoms in self and other are in the same order.
Usage:
>>> print molecule1.rmsd(molecule2)[2]/angstrom
-
set_default_graph
()¶ Set self.graph to the default graph.
This method is equivalent to:
mol.graph = MolecularGraph.from_geometry(mol)
with the default options, and only works if the graph object is not present yet. See
molmod.molecular_graphs.MolecularGraph.from_geometry()
for more fine-grained control over the assignment of bonds.
-
set_default_masses
()¶ Set self.masses based on self.numbers and periodic table.
-
set_default_symbols
()¶ Set self.symbols based on self.numbers and the periodic table.
-
size
¶ Read-only attribute: the number of atoms.
-
symbols
¶ Read-only attribute: symbols for the atoms, which can be element names for force-field atom types.
The attribute must satisfy the following conditions:
- Must be an instance of
<type 'tuple'>
- Must be an instance of
-
title
¶ Read-only attribute: a short description of the system.
The attribute must satisfy the following conditions:
- Must be an instance of
<type 'str'>
- Must be an instance of
-
unit_cell
¶ Read-only attribute: description of the periodic boundary conditions.
The attribute must satisfy the following conditions:
- Must be an instance of
<class 'molmod.unit_cells.UnitCell'>
- Must be an instance of
-
write_to_file
(filename)¶ Write the molecular geometry to a file.
The file format is inferred from the extensions. Currently supported formats are:
*.xyz
,*.cml
- Argument:
filename
– a filename
molmod.graphs
– Abstract graphs¶
Some distinctive features include:
- Iterating over all shortest paths between two vertices (Dijkstra). See http://en.wikipedia.org/wiki/Dijkstra’s_algorithm for more info.
- Iterating over vertices or edges using the Breadth First convention. See http://en.wikipedia.org/wiki/Breadth-first_search for more info.
- The all pairs shortest path matrix (Floyd-Warshall). See http://en.wikipedia.org/wiki/Floyd-Warshall_algorithm for more info.
- Symmetry analysis of graphs (automorphisms). The Graph class can generate a list of permutations between vertices that map the graph onto itself. This can be used to generate (and test) all possible geometric symmetries in a molecule. See http://en.wikipedia.org/wiki/Graph_automorphism for more info.
- Scanning a graph for patterns. The GraphSearch is a generic class that can scan a graph for certain patterns, e.g. given pattern_graphs, strong rings, isomorphisms, automorphisms, … The pattern_graph can deal with (multiple sets of) additional conditions that must be satisfied, such as “give me all dihedral angles where the central atoms are carbons” without duplicates.
The central class in this module is ‘Graph’. It caches most of the analysis results, which implies that the graph structure can not be changed once the object is created. If you feel the need to do this, just construct a new graph object.
-
exception
molmod.graphs.
GraphError
¶ Raised when something goes wrong in one of the graph algorithms
-
class
molmod.graphs.
Graph
(edges, num_vertices=None)¶ An undirected graph, where edges have equal weight
The edges attribute is the most elementary and is always available. All other attributes are optional.
Graphs are meant to be immutable objects. Once created they are not supposed to be modified. If you want an adapted graph, create a new object. The main reason is that all graph analysis work is cached in this object. When the edges change, the cache becomes invalid and should be erased. The latter is not supported as it is easier to create just a new graph object with the modified edges.
If you want to associate data with vertices or edges, add custom dictionary or list attributes to the graph object. This is the way to work attributes that are not applicable to all vertices or edges:
>>> graph.vertex_property = {vertex1: blablabla, vertex2: blablabla, ...} >>> graph.edge_property = {frozenset([vertex1, vertex2]): blablabla, ..}
If a certain property applies to all vertices or edges, it is sometimes more practical to work with lists or
numpy
arrays that have the same ordering as the vertices or the edges:>>> # atom numbers of ethene >>> graph.vertex_property = np.array([6, 6, 1, 1, 1, 1], int) >>> # bond orders of ethene >>> graph.edge_property = np.array([2, 1, 1, 1, 1], int)
- Arguments:
edges
–tuple(frozenset([vertex1, vertex2]), ...)
num_vertices
– number of vertices
vertex1 to vertexN must be integers from 0 to N-1. If vertices above the highest vertex value are not connected by edges, use the num_vertices argument to tell what the total number of vertices is.
If the edges argument does not have the correct format, it will be converted.
-
canonical_order
¶ Cached attribute: The vertices in a canonical or normalized order..
This routine will return a list of vertices in an order that does not depend on the initial order, but only depends on the connectivity and the return values of the function self.get_vertex_string.
Only the vertices that are involved in edges will be included. The result can be given as first argument to self.get_subgraph, with reduce=True as second argument. This will return a complete canonical graph.
The routine is designed not to use symmetry relations that are obtained with the GraphSearch routine. We also tried to create an ordering that feels like natural, i.e. starting in the center and pushing vertices with few equivalents to the front. If necessary, the nature of the vertices and their bonds to atoms closer to the center will also play a role, but only as a last resort.
-
central_vertex
¶ Cached attribute: The vertex that has the lowest maximum distance to any other vertex.
This definition does not lead to a unique solution. One arbitrary solution is selected.
-
central_vertices
¶ Cached attribute: Vertices that have the lowest maximum distance to any other vertex.
-
distances
¶ Cached attribute: The matrix with the all-pairs shortest path lenghts.
-
edge_index
¶ Cached attribute: A map to look up the index of a edge.
-
edges
¶ Read-only attribute: the incidence list.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'tuple'>
-
equivalent_vertices
¶ Cached attribute: A dictionary with symmetrically equivalent vertices..
-
fingerprint
¶ Cached attribute: A total graph fingerprint.
The result is invariant under permutation of the vertex indexes. The chance that two different (molecular) graphs yield the same fingerprint is small but not zero. (See unit tests.)
-
full_match
(other)¶ Find the mapping between vertex indexes in self and other.
This also works on disconnected graphs. Derived classes should just implement get_vertex_string and get_edge_string to make this method aware of the different nature of certain vertices. In case molecules, this would make the algorithm sensitive to atom numbers etc.
-
get_edge_string
(i)¶ Returns a string that fully characterizes the nature of an edge
In case of an ordinary graph, all edges are the same.
-
get_halfs
(vertex1, vertex2)¶ Split the graph in two halfs by cutting the edge: vertex1-vertex2
If this is not possible (due to loops connecting both ends), a GraphError is raised.
Returns the vertices in both halfs.
-
get_halfs_double
(vertex_a1, vertex_b1, vertex_a2, vertex_b2)¶ Compute the two parts separated by
(vertex_a1, vertex_b1)
and(vertex_a2, vertex_b2)
Raise a GraphError when
(vertex_a1, vertex_b1)
and(vertex_a2, vertex_b2)
do not separate the graph in two disconnected parts. The edges must be neighbors. If not a GraphError is raised. The for vertices must not coincide or a GraphError is raised.Returns the vertices of the two halfs and the four ‘hinge’ vertices in the correct order, i.e. both
vertex_a1
andvertex_a2
are in the first half and bothvertex_b1
andvertex_b2
are in the second half.
-
get_part
(vertex_in, vertices_border)¶ List all vertices that are connected to vertex_in, but are not included in or ‘behind’ vertices_border.
-
get_subgraph
(subvertices, normalize=False)¶ Constructs a subgraph of the current graph
- Arguments:
subvertices
– The vertices that should be retained.normalize
– Whether or not the vertices should renumbered and reduced to the given set of subvertices. When True, also the edges are sorted. It the end, this means that new order of the edges does not depend on the original order, but only on the order of the argument subvertices. This option is False by default. When False, only edges will be discarded, but the retained data remain unchanged. Also the parameter num_vertices is not affected.
The returned graph will have an attribute
old_edge_indexes
that relates the positions of the new and the old edges as follows:>>> self.edges[result._old_edge_indexes[i]] = result.edges[i]
In derived classes, the following should be supported:
>>> self.edge_property[result._old_edge_indexes[i]] = result.edge_property[i]
When
normalize==True
, also the vertices are affected and the derived classes should make sure that the following works:>>> self.vertex_property[result._old_vertex_indexes[i]] = result.vertex_property[i]
The attribute
old_vertex_indexes
is only constructed whennormalize==True
.
-
get_vertex_fingerprints
(vertex_strings, edge_strings, num_iter=None)¶ Return an array with fingerprints for each vertex
-
get_vertex_string
(i)¶ Returns a string that fully characterizes the nature of the vertex
In case of an ordinary graph, all vertices are the same.
-
independent_vertices
¶ Cached attribute: Lists of vertices that are only interconnected within each list.
This means that there is no path from a vertex in one list to a vertex in another list. In case of a molecular graph, this would yield the atoms that belong to individual molecules.
-
iter_breadth_first
(start=None, do_paths=False, do_duplicates=False)¶ Iterate over the vertices with the breadth first algorithm.
See http://en.wikipedia.org/wiki/Breadth-first_search for more info. If not start vertex is given, the central vertex is taken.
By default, the distance to the starting vertex is also computed. If the path to the starting vertex should be computed instead, set path to True.
When duplicate is True, then vertices that can be reached through different paths of equal length, will be iterated twice. This typically only makes sense when path==True.
-
iter_breadth_first_edges
(start=None)¶ Iterate over the edges with the breadth first convention.
We need this for the pattern matching algorithms, but a quick look at Wikipedia did not result in a known and named algorithm.
The edges are yielded one by one, together with the distance of the edge from the starting vertex and a flag that indicates whether the yielded edge connects two vertices that are at the same distance from the starting vertex. If that flag is False, the distance from the starting vertex to edge[0] is equal to the distance variable and the distance from edge[1] to the starting vertex is equal to distance+1. One item has the following format: ((i, j), distance, flag)
-
iter_shortest_paths
(a, b)¶ Iterate over all the shortest paths between vertex a and b.
-
max_distance
¶ Cached attribute: The maximum value in the distances matrix..
-
neighbors
¶ Cached attribute: A dictionary with neighbors.
The dictionary will have the following form:
{vertexX: (vertexY1, vertexY2, ...), ...}
This means that vertexX and vertexY1 are connected etc. This also implies that the following elements are part of the dictionary:{vertexY1: (vertexX, ...), vertexY2: (vertexX, ...), ...}
.
-
num_edges
¶ Read-only attribute: the number of edges in the graph.
-
num_vertices
¶ Read-only attribute: the number of vertices.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'int'>
-
symmetries
¶ Cached attribute: Graph symmetries (permutations) that map the graph onto itself..
-
symmetry_cycles
¶ Cached attribute: The cycle representations of the graph symmetries.
-
vertex_fingerprints
¶ Cached attribute: A fingerprint for each vertex.
The result is invariant under permutation of the vertex indexes. Vertices that are symmetrically equivalent will get the same fingerprint, e.g. the hydrogens in methane would get the same fingerprint.
-
class
molmod.graphs.
OneToOne
(relations=None)¶ Implements a discrete bijection between source and destination elements
The implementation is based on a consistent set of forward and reverse relations, stored in dictionaries.
- Argument:
relations
– initial relations for the bijection
-
add_relation
(source, destination)¶ Add new a relation to the bejection
-
add_relations
(relations)¶ Add multiple relations to a bijection
-
get_destination
(source)¶ Get the end point of a relation that start with ‘source’
-
get_source
(destination)¶ Get the starting point of a relation that ends with ‘destination’
-
in_destinations
(destination)¶ Test if the given destination is present
-
in_sources
(source)¶ Test if the given source is present
-
inverse
()¶ Returns the inverse bijection.
-
class
molmod.graphs.
Match
(relations=None)¶ A match between a pattern and a graph
- Argument:
relations
– initial relations for the bijection
-
copy_with_new_relations
(new_relations)¶ Create a new match object extended with new relations
-
classmethod
from_first_relation
(vertex0, vertex1)¶ Intialize a fresh match based on the first relation
-
get_new_edges
(subject_graph)¶ Get new edges from the subject graph for the graph search algorithm
The Graph search algorithm extends the matches iteratively by adding matching vertices that are one edge further from the starting vertex at each iteration.
-
class
molmod.graphs.
Pattern
¶ Base class for a pattern in a graph.
Note the following conventions:
- A pattern can always be represented by a graph (or a set of graphs) and some additional conditions. This graph is the so called ‘PATTERN GRAPH’. For technical reasons, this pattern graph is not always constructed explicitly. Variables related to this graph often get suffix ‘0’. Note that a pattern graph is always fully connected.
- The graph in which we search for the pattern, is called the ‘SUBJECT GRAPH’. Variables related to this graph often get suffix ‘1’.
-
check_next_match
(match, new_relations, subject_graph, one_match)¶ Does this match object make sense for the current pattern
Return False if some symmetry or other considerations are not satisfied. The checks in this function are typically only possible by considering the whole instead of looking just at a few vertices/edges/relations.
-
check_symmetry
(new_relations, current_match, next_match)¶ Off all symmetric
new_relations
, only allow one
-
compare
(vertex0, vertex1, subject_graph)¶ Test if
vertex0
andvertex1
can be equalFalse positives are allowed, but the less false positives, the more efficient the
GraphSearch
will be.
-
complete
(match, subject_graph)¶ Returns
True
if no more additional relations are required
-
get_new_edges
(level)¶ Get new edges from the pattern graph for the graph search algorithm
The level argument denotes the distance of the new edges from the starting vertex in the pattern graph.
-
iter_final_matches
(match, subject_graph, one_match)¶ Just return the original match
Derived classes can specialize here to make efficient use of symmetries
-
iter_initial_relations
(subject_graph)¶ Iterate over all initial relations to start a Graph Search
The function iterates of single relations between a pattern vertex and a subject vertex. In practice it is sufficient to select one vertex in the pattern and then associate it with each (reasonable) vertex in the subject graph.
-
sub
= True¶
-
class
molmod.graphs.
CriteriaSet
(vertex_criteria=None, edge_criteria=None, **kwargs)¶ A set of criteria that can be associated with a custum pattern.
- Arguments:
vertex_criteria
– a dictionary with criteria for the vertices,key=vertex_index
,value=criterion
edge_criteria
– a dictionary with criteria for the edgeskey=edge_index
,value=criterion
Any other keyword argument will be assigned as attribute to matches that fulfill the criteria of this set.
-
test_match
(match, pattern_graph, subject_graph)¶ Test if a match satisfies the criteria
-
class
molmod.graphs.
Anything
¶ A criterion that always returns True
-
__call__
(index, subject_graph)¶ Always returns True
-
-
class
molmod.graphs.
CritOr
(*criteria)¶ OR Operator for criteria objects
- Argument:
criteria
– a list of criteria to apply the OR operation to.
-
__call__
(index, graph)¶ Evaluates all the criteria and applies an OR opartion
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
-
class
molmod.graphs.
CritAnd
(*criteria)¶ AND Operator for criteria objects
- Argument:
criteria
– a list of criteria to apply the AND operation to
-
__call__
(index, graph)¶ Evaluates all the criteria and applies an AND opartion
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
-
class
molmod.graphs.
CritXor
(*criteria)¶ XOR Operator for criteria objects
- Argument:
criteria
– a list of criteria to apply the XOR operation to.
-
__call__
(index, graph)¶ Evaluates all the criteria and applies a generalized XOR opartion
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
when the XOR operation is applied to more than two criteria, True is only returned when an odd number of criteria return True.
-
class
molmod.graphs.
CritNot
(criterion)¶ Inverion of another criterion
- Argument:
criterion
– another criterion object
-
__call__
(index, graph)¶ Evaluates all the criterion and applies an inversion opartion
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
-
class
molmod.graphs.
CustomPattern
(pattern_graph, criteria_sets=None, vertex_tags=None, start_vertex=None)¶ A pattern based on a given pattern graph
The pattern graph can be complemented with additional criteria for the vertices and edges. Additionally the effective symmetry of the pattern graph can be reduced by tagging the vertices in the pattern graph with different labels.
- Arguments:
pattern_graph
– the pattern that has to be found in the subject graph.criteria_sets
– Criteria sets associate additional conditions with vertices and edges, and can also introduce global match conditions.vertex_tags
– vertex tags can reduce the symmetry of the pattern_graph pattern. An example case where this is useful: Consider atoms 0, 1, 2 that are bonded in this order. We want to compute the distance from atom 2 to the line (0, 1). In this case the matches (0->a, 1->b, 2->c) and (0->c, 1->b, 2->a) correspond to different internal coordinates. We want the graph search to return the two solutions. In order to do this, setvertex_tags={0:0, 1:0, 2:1}
. This means that vertex 0 and 1 are equivalent, but that vertex 2 has a different nature. In the case of a bending angle, only one match like (0->a, 1->b, 2->c) is sufficient and we do not want to reduce the symmetry of thepattern_graph
. In this case, one should not use vertex_tags at all.start_vertex
– The first vertex in the pattern graph that is linked with a vertex in the subject graph. A wise choice can improve the performance of a graph search. If not given, the central vertex is take as start_vertex
-
check_next_match
(match, new_relations, subject_graph, one_match)¶ Check if the (onset for a) match can be a valid
-
complete
(match, subject_graph)¶ Return True of the match is complete
-
get_new_edges
(level)¶ Get new edges from the pattern graph for the graph search algorithm
The level argument denotes the distance of the new edges from the starting vertex in the pattern graph.
-
iter_final_matches
(canonical_match, subject_graph, one_match)¶ Given a match, iterate over all related equivalent matches
When criteria sets are defined, the iterator runs over all symmetric equivalent matches that fulfill one of the criteria sets. When not criteria sets are defined, the iterator only yields the input match.
-
iter_initial_relations
(subject_graph)¶ Iterate over all valid initial relations for a match
-
class
molmod.graphs.
EqualPattern
(pattern_graph)¶ Like CustomPattern, but the pattern has the same size as the subject graph
See
CustomPattern.__init__()
-
MatchClass
¶ alias of
EqualMatch
-
compare
(vertex0, vertex1, subject_graph)¶ Returns true when the two vertices are of the same kind
-
iter_initial_relations
(subject_graph)¶ Iterate over all valid initial relations for a match
-
sub
= False¶
-
-
class
molmod.graphs.
RingPattern
(max_size)¶ A pattern that matches strong rings up to a given size
- Argument:
max_size
– the maximum number of vertices in a ring
-
check_next_match
(match, new_relations, subject_graph, one_match)¶ Check if the (onset for a) match can be a valid (part of a) ring
-
complete
(match, subject_graph)¶ Check the completeness of a ring match
-
get_new_edges
(level)¶ Get new edges from the pattern graph for the graph search algorithm
The level argument denotes the distance of the new edges from the starting vertex in the pattern graph.
-
iter_initial_relations
(subject_graph)¶ Iterate over all valid initial relations for a match
-
class
molmod.graphs.
GraphSearch
(pattern, debug=False)¶ An algorithm that searches for all matches of a pattern in a graph
Usage:
>>> gs = GraphSearch(pattern) >>> for match in gs(graph): ... print match.forward
- Arguments:
pattern
– A Pattern instance, describing the pattern to look fordebug
– When true, debugging info is printed on screen [default=False]
-
__call__
(subject_graph, one_match=False)¶ Iterator over all matches of self.pattern in the given graph.
- Arguments:
- subject_graph – The subject_graph in which the matches according to self.pattern have to be found.one_match – If True, only one match will be returned. This allows certain optimizations.
-
print_debug
(text, indent=0)¶ Only prints debug info on screen when self.debug == True.
molmod.molecular_graphs
– Molecular graphs¶
This is an extension of the Graph object with molecular features.
-
class
molmod.molecular_graphs.
MolecularGraph
(edges, numbers, orders=None, symbols=None, num_vertices=None)¶ Describes a molecular graph: connectivity, atom numbers and bond orders.
This class inherits all features from the Graph class and adds methods and attributes that are specific from molecular graphs. Instances are immutable, so if you want to modify a molecular graph, just instantiate a new object with modified connectivity, numbers and orders. The advantage is that various graph analysis and properties can be cached.
- Arguments:
edges
– See base class (Graph) documentationnumbers
– consecutive atom numbers- Optional arguments:
orders
– bond orderssymbols
– atomic symbolsnum_vertices
– must be the same as the number of atoms or None
When the nature of an atom or a bond is unclear ambiguous, set the corresponding integer to zero. This means the nature of the atom or bond is unspecified. When the bond orders are not given, they are all set to zero.
If you want to use ‘special’ atom types, use negative numbers. The same for bond orders. e.g. a nice choice for the bond order of a hybrid bond is -1.
-
add_hydrogens
(formal_charges=None)¶ Returns a molecular graph where hydrogens are added explicitely
When the bond order is unknown, it assumes bond order one. If the graph has an attribute formal_charges, this routine will take it into account when counting the number of hydrogens to be added. The returned graph will also have a formal_charges attribute.
This routine only adds hydrogen atoms for a limited set of atoms from the periodic system: B, C, N, O, F, Al, Si, P, S, Cl, Br.
-
blob
¶ Cached attribute: A compact text representation of the graph.
-
edges
¶ Read-only attribute: the incidence list.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'tuple'>
-
classmethod
from_blob
(s)¶ Construct a molecular graph from the blob representation
-
classmethod
from_geometry
(molecule, do_orders=False, scaling=1.0)¶ Construct a MolecularGraph object based on interatomic distances
All short distances are computed with the binning module and compared with a database of bond lengths. Based on this comparison, bonded atoms are detected.
Before marking a pair of atoms A and B as bonded, it is also checked that there is no third atom C somewhat between A and B. When an atom C exists that is closer to B (than A) and the angle A-B-C is less than 45 degrees, atoms A and B are not bonded. Similarly if C is closer to A (than B) and the angle B-A-C is less then 45 degrees, A and B are not connected.
- Argument:
molecule
– The molecule to derive the graph from- Optional arguments:
do_orders
– set to True to estimate the bond orderscaling
– scale the threshold for the connectivity. increase this to 1.5 in case of transition states when a fully connected topology is required.
-
get_edge_string
(i)¶ Return a string based on the bond order
-
get_subgraph
(subvertices, normalize=False)¶ Creates a subgraph of the current graph
See
molmod.graphs.Graph.get_subgraph()
for more information.
-
get_vertex_string
(i)¶ Return a string based on the atom number
-
num_vertices
¶ Read-only attribute: the number of vertices.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'int'>
-
numbers
¶ Read-only attribute: the atomic numbers associated with the vertices.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 1.
- Must have dtype
<type 'numpy.signedinteger'>
. - Special conditions: the number of vertices and atomic numbers must be the same.
-
orders
¶ Read-only attribute: the bond orders associated with the edges.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 1.
- Must have dtype
<type 'numpy.floating'>
. - Special conditions: the number of edges and bond orders must be the same.
-
symbols
¶ Read-only attribute: symbols for the atoms, which can be element names for force-field atom types.
The attribute must satisfy the following conditions:
- Must be an instance of
<type 'tuple'>
- Must be an instance of
-
class
molmod.molecular_graphs.
HasAtomNumber
(number)¶ Criterion for the atom number of a vertex
- Arguments:
number
– the expected atom number
-
__call__
(index, graph)¶ Return True only if the atom number is correct
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
-
class
molmod.molecular_graphs.
HasNumNeighbors
(count)¶ Criterion for the number of neighboring vertexes
- Arguments:
count
– the expected number of neighbors
-
__call__
(index, graph)¶ Return True only if the number of neighbors is correct
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
-
class
molmod.molecular_graphs.
HasNeighborNumbers
(*numbers)¶ Criterion for the atom numbers of the neighbor vertexes
- Arguments:
*numbers
– a list with atom numbers
-
__call__
(index, graph)¶ Return True only if each neighbor can be linked with an atom number
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
-
class
molmod.molecular_graphs.
HasNeighbors
(*neighbor_criteria)¶ Tests if the neighbors of a vertex match the given criteria
- Arguments:
*neighbor_criteria
– a list of criteria objects
-
__call__
(index, graph)¶ Return True only if each neighbor can be linked with a positive criterion
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
-
class
molmod.molecular_graphs.
BondLongerThan
(length)¶ A vertex criterion to select bonds longer than a given threshold
This criterion assumes that the molecular graph has an attribute self.bond_lengths
- Argument:
length
– the minimum length of the bond
-
__call__
(index, graph)¶ Return True only if the bond is longer than the threshold
- Arguments:
index
– the index of the vertex/edge on which the criterion is appliedgraph
– the graph on which the criterion is tested
-
molmod.molecular_graphs.
atom_criteria
(*params)¶ An auxiliary function to construct a dictionary of Criteria
-
class
molmod.molecular_graphs.
BondPattern
(criteria_sets=None, vertex_tags=None)¶ Pattern for two consecutive vertices
Arguments: see
molmod.graphs.CustomPattern
-
class
molmod.molecular_graphs.
BendingAnglePattern
(criteria_sets=None, vertex_tags=None)¶ Pattern for three consecutive vertices
Arguments: see
molmod.graphs.CustomPattern
-
class
molmod.molecular_graphs.
DihedralAnglePattern
(criteria_sets=None, vertex_tags=None)¶ Pattern for four consecutive vertices
Arguments: see
molmod.graphs.CustomPattern
-
class
molmod.molecular_graphs.
OutOfPlanePattern
(criteria_sets=None, vertex_tags=None)¶ Pattern for a central vertex connected to three other vertices
Arguments: see
molmod.graphs.CustomPattern
-
class
molmod.molecular_graphs.
TetraPattern
(criteria_sets=None, vertex_tags=None)¶ Pattern for a central vertex connected to four other vertices
Arguments: see
molmod.graphs.CustomPattern
-
class
molmod.molecular_graphs.
NRingPattern
(size, criteria_sets=None, vertex_tags=None, strong=False)¶ Pattern for strong rings with a fixed size
- Argument:
size
– the size of the ring
-
check_next_match
(match, new_relations, subject_graph, one_match)¶ Check if the (onset for a) match can be a valid (part of a) ring
-
complete
(match, subject_graph)¶ Check the completeness of the ring match
molmod.unit_cells
– Periodic boundary conditions¶
-
class
molmod.unit_cells.
UnitCell
(matrix, active=None)¶ Extensible representation of a unit cell.
Most attributes of the UnitCell object are treated as constants. If you want to modify the unit cell, just create a modified UnitCell object. This facilitates the caching of derived quantities such as the distance matrices, while it imposes a cleaner coding style without a significant computational overhead.
- Argument:
matrix
– the array with cell vectors. each column corresponds to a single cell vector.- Optional argument:
active
– an array with three boolean values indicating which cell vectors are active. [default: all three True]
-
active
¶ Read-only attribute: the active cell vectors.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 1.
- Must have shape (3).
- Must have dtype
<type 'numpy.bool_'>
.
-
active_inactive
¶ Cached attribute: The indexes of the active and the inactive cell vectors.
-
add_cell_vector
(vector)¶ Returns a new unit cell with an additional cell vector
-
alignment_a
¶ Cached attribute: Computes the rotation matrix that aligns the unit cell with the. Cartesian axes, starting with cell vector a.
- a parallel to x
- b in xy-plane with b_y positive
- c with c_z positive
-
alignment_c
¶ Cached attribute: Computes the rotation matrix that aligns the unit cell with the. Cartesian axes, starting with cell vector c.
- c parallel to z
- b in zy-plane with b_y positive
- a with a_x positive
-
eps
= 1e-06¶
-
classmethod
from_parameters3
(lengths, angles)¶ Construct a 3D unit cell with the given parameters
The a vector is always parallel with the x-axis and they point in the same direction. The b vector is always in the xy plane and points towards the positive y-direction. The c vector points towards the positive z-direction.
-
get_radius_indexes
(radius, max_ranges=None)¶ Return the indexes of the interacting neighboring unit cells
Interacting neighboring unit cells have at least one point in their box volume that has a distance smaller or equal than radius to at least one point in the central cell. This concept is of importance when computing pair wise long-range interactions in periodic systems.
- Argument:
radius
– the radius of the interaction sphere- Optional argument:
max_ranges
– numpy array with three elements: The maximum ranges of indexes to consider. This is practical when working with the minimum image convention to reduce the generated bins to the minimum image. (see binning.py) Use -1 to avoid such limitations. The default is three times -1.
-
get_radius_ranges
(radius, mic=False)¶ Return ranges of indexes of the interacting neighboring unit cells
Interacting neighboring unit cells have at least one point in their box volume that has a distance smaller or equal than radius to at least one point in the central cell. This concept is of importance when computing pair wise long-range interactions in periodic systems.
The mic (stands for minimum image convention) option can be used to change the behavior of this routine such that only neighboring cells are considered that have at least one point withing a distance below radius from the center of the reference cell.
-
matrix
¶ Read-only attribute: matrix whose columns are the primitive cell vectors.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 2.
- Must have shape (3, 3).
- Must have dtype
<type 'numpy.floating'>
.
-
ordered
¶ Cached attribute: An equivalent unit cell with the active cell vectors coming first.
-
parameters
¶ Cached attribute: The cell parameters (lengths and angles).
-
reciprocal
¶ Cached attribute: The reciprocal of the unit cell.
In case of a three-dimensional periodic system, this is trivially the transpose of the inverse of the cell matrix. This means that each column of the matrix corresponds to a reciprocal cell vector. In case of lower-dimensional periodicity, the inactive columns are zero, and the active columns span the same sub space as the original cell vectors.
-
shortest_vector
(delta)¶ Compute the relative vector under periodic boundary conditions.
- Argument:
delta
– the relative vector between two points
The return value is not necessarily the shortest possible vector, but instead is the vector with fractional coordinates in the range [-0.5,0.5[. This is most of the times the shortest vector between the two points, but not always. (See commented test.) It is always the shortest vector for orthorombic cells.
-
spacings
¶ Cached attribute: Computes the distances between neighboring crystal planes.
-
to_cartesian
(fractional)¶ Converts fractional to Cartesian coordinates
- Argument:
fractional
– Can be a numpy array with shape (3, ) or with shape (N, 3).
The return value has the same shape as the argument. This function is the inverse of to_fractional.
-
to_fractional
(cartesian)¶ Convert Cartesian to fractional coordinates
- Argument:
cartesian
– Can be a numpy array with shape (3, ) or with shape (N, 3).
The return value has the same shape as the argument. This function is the inverse of to_cartesian.
-
volume
¶ Cached attribute: The volume of the unit cell.
The actual definition of the volume depends on the number of active directions:
- num_active == 0 – always -1
- num_active == 1 – length of the cell vector
- num_active == 2 – surface of the parallelogram
- num_active == 3 – volume of the parallelepiped
molmod.transformations
Transformations in 3D¶
In addition to Translation, Rotation and Complete classes, two utility functions are provided: rotation_around_center and superpose. The latter is an implementation of the Kabsch algorithm.
-
class
molmod.transformations.
Translation
(t)¶ Represents a translation in 3D
The attribute t contains the actual translation vector, which is a numpy array with three elements.
- Argument:
t
– translation vector, a list-like object with three numbers
-
apply_to
(x, columns=False)¶ Apply this translation to the given object
The argument can be several sorts of objects:
np.array
with shape (3, )np.array
with shape (N, 3)np.array
with shape (3, N), usecolumns=True
Translation
Rotation
Complete
UnitCell
In case of arrays, the 3D vectors are translated. In case of trans- formations, a new transformation is returned that consists of this translation applied AFTER the given translation. In case of a unit cell, the original object is returned.
This method is equivalent to
self*x
.
-
compare
(other, t_threshold=0.001)¶ Compare two translations
The RMSD of the translation vectors is computed. The return value is True when the RMSD is below the threshold, i.e. when the two translations are almost identical.
-
classmethod
from_matrix
(m)¶ Initialize a translation from a 4x4 matrix
-
classmethod
identity
()¶ Return the identity transformation
-
inv
¶ Cached attribute: The inverse translation.
-
matrix
¶ Cached attribute: The 4x4 matrix representation of this translation.
-
t
¶ Read-only attribute: the translation vector.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 1.
- Must have shape (3).
- Must have dtype
<type 'numpy.floating'>
.
-
class
molmod.transformations.
Rotation
(r)¶ Represents a rotation in 3D about the origin
The attribute r contains the actual rotation matrix, which is a numpy array with shape (3, 3).
- Argument:
r
– rotation matrix, a 3 by 3 orthonormal array-like object
-
apply_to
(x, columns=False)¶ Apply this rotation to the given object
The argument can be several sorts of objects:
np.array
with shape (3, )np.array
with shape (N, 3)np.array
with shape (3, N), usecolumns=True
Translation
Rotation
Complete
UnitCell
In case of arrays, the 3D vectors are rotated. In case of trans- formations, a transformation is returned that consists of this rotation applied AFTER the given translation. In case of a unit cell, a unit cell with rotated cell vectors is returned.
This method is equivalent to
self*x
.
-
compare
(other, r_threshold=0.001)¶ Compare two rotations
The RMSD of the rotation matrices is computed. The return value is True when the RMSD is below the threshold, i.e. when the two rotations are almost identical.
-
classmethod
from_matrix
(m)¶ Initialize a rotation from a 4x4 matrix
-
classmethod
from_properties
(angle, axis, invert)¶ Initialize a rotation based on the properties
-
classmethod
identity
()¶ Return the identity transformation
-
inv
¶ Cached attribute: The inverse rotation.
-
matrix
¶ Cached attribute: The 4x4 matrix representation of this rotation.
-
properties
¶ Cached attribute: Rotation properties: angle, axis, invert.
-
r
¶ Read-only attribute: the rotation matrix.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 2.
- Must have shape (3, 3).
- Must have dtype
<type 'numpy.floating'>
. - Special conditions: the columns must orthogonal.
-
classmethod
random
()¶ Return a random rotation
-
class
molmod.transformations.
Complete
(r, t)¶ Represents a rotation and translation in 3D
The attribute t contains the actual translation vector, which is a numpy array with three elements. The attribute r contains the actual rotation matrix, which is a numpy array with shape (3, 3).
Internally the translation part is always applied after the rotation part.
- Arguments:
r
– rotation matrix, a 3 by 3 orthonormal array-like objectt
– translation vector, a list-like object with three numbers
-
classmethod
about_axis
(center, angle, axis, invert=False)¶ Create transformation that represents a rotation about an axis
- Arguments:
center
– Point on the axisangle
– Rotation angleaxis
– Rotation axisinvert
– When True, an inversion rotation is constructed [default=False]
-
apply_to
(x, columns=False)¶ Apply this transformation to the given object
The argument can be several sorts of objects:
np.array
with shape (3, )np.array
with shape (N, 3)np.array
with shape (3, N), usecolumns=True
Translation
Rotation
Complete
UnitCell
In case of arrays, the 3D vectors are transformed. In case of trans- formations, a transformation is returned that consists of this transformation applied AFTER the given translation. In case of a unit cell, a unit cell with rotated cell vectors is returned. (The translational part does not affect the unit cell.)
This method is equivalent to self*x.
-
classmethod
cast
(c)¶ Convert the first argument into a Complete object
-
compare
(other, t_threshold=0.001, r_threshold=0.001)¶ Compare two transformations
The RMSD values of the rotation matrices and the translation vectors are computed. The return value is True when the RMSD values are below the thresholds, i.e. when the two transformations are almost identical.
-
classmethod
from_matrix
(m)¶ Initialize a complete transformation from a 4x4 matrix
-
classmethod
from_properties
(angle, axis, invert, translation)¶ Initialize a transformation based on the properties
-
classmethod
identity
()¶ Return the identity transformation
-
inv
¶ Cached attribute: The inverse transformation.
-
matrix
¶ Cached attribute: The 4x4 matrix representation of this transformation.
-
properties
¶ Cached attribute: Transformation properties: angle, axis, invert, translation.
-
r
¶ Read-only attribute: the rotation matrix.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 2.
- Must have shape (3, 3).
- Must have dtype
<type 'numpy.floating'>
. - Special conditions: the columns must orthogonal.
-
t
¶ Read-only attribute: the translation vector.
The attribute must satisfy the following conditions:
- May not be None.
- Must be an instance of
<type 'numpy.ndarray'>
- Must have dimension 1.
- Must have shape (3).
- Must have dtype
<type 'numpy.floating'>
.
-
molmod.transformations.
superpose
(ras, rbs, weights=None)¶ Compute the transformation that minimizes the RMSD between the points ras and rbs
- Arguments:
ras
– anp.array
with 3D coordinates of geometry A, shape=(N,3)rbs
– anp.array
with 3D coordinates of geometry B, shape=(N,3)- Optional arguments:
weights
– a numpy array with fitting weights for each coordinate, shape=(N,)- Return value:
transformation
– the transformation that brings geometry A into overlap with geometry B
Each row in ras and rbs represents a 3D coordinate. Corresponding rows contain the points that are brought into overlap by the fitting procedure. The implementation is based on the Kabsch Algorithm:
-
molmod.transformations.
fit_rmsd
(ras, rbs, weights=None)¶ Fit geometry rbs onto ras, returns more info than superpose
- Arguments:
ras
– a numpy array with 3D coordinates of geometry A, shape=(N,3)rbs
– a numpy array with 3D coordinates of geometry B, shape=(N,3)- Optional arguments:
weights
– a numpy array with fitting weights for each coordinate, shape=(N,)- Return values:
transformation
– the transformation that brings geometry A into overlap with geometry Brbs_trans
– the transformed coordinates of geometry Brmsd
– the rmsd of the distances between corresponding atoms in geometry A and B
This is a utility routine based on the function superpose. It just computes rbs_trans and rmsd after calling superpose with the same arguments