NAME
HackaMol - HackaMol: Object-Oriented Library for Molecular Hacking
VERSION
version 0.053
DESCRIPTION
The HackaMol publication has a more complete description of the library (pdf available from researchgate).
Citation: J. Chem. Inf. Model., 2015, 55, 721
Loading the HackaMol library in a script with
use HackaMol;
provides attributes and methods of a builder class. It also loads all the classes provided by the core so including them is not necessary, e.g.:
use HackaMol::Atom;
use HackaMol::Bond;
use HackaMol::Angle;
use HackaMol::Dihedral;
use HackaMol::AtomGroup;
use HackaMol::Molecule;
The methods, described below, facilitate the creation of objects from files and other objects. It is a builder class that evolves more rapidly than the classes for the molecular objects. For example, the superpose_rt and rmsd methods will likely be moved to a more suitable class or functional module.
METHODS
pdbid_mol
one argument: pdbid
This method will download the pdb, unless it exists, and load it into a HackaMol::Molecule object. For example,
my $mol = HackaMol->new->pdbid_mol('2cba');
read_file_atoms
one argument: filename.
This method parses the file (e.g. file.xyz, file.pdb) and returns an array of HackaMol::Atom objects. It uses the filename postfix to decide which parser to use. e.g. file.pdb will trigger the pdb parser.
read_file_mol
one argument: filename.
This method parses the file (e.g. file.xyz, file.pdb) and returns a HackaMol::Molecule object.
read_file_push_coords_mol
two arguments: filename and a HackaMol::Molecule object.
This method reads the coordinates from a file and pushes them into the atoms contained in the molecule. Thus, the atoms in the molecule and the atoms in the file must be the same.
build_bonds
takes a list of atoms and returns a list of bonds. The bonds are generated for "list neighbors" by simply stepping through the atom list one at a time. e.g.
my @bonds = $hack->build_bonds(@atoms[1,3,5]);
will return two bonds: B13 and B35
build_angles
takes a list of atoms and returns a list of angles. The angles are generated analagously to build_bonds, e.g.
my @angles = $hack->build_angles(@atoms[1,3,5]);
will return one angle: A135
build_dihedrals
takes a list of atoms and returns a list of dihedrals. The dihedrals are generated analagously to build_bonds, e.g.
my @dihedral = $hack->build_dihedrals(@atoms[1,3,5]);
will croak! you need atleast four atoms.
my @dihedral = $hack->build_dihedrals(@atoms[1,3,5,6,9]);
will return two dihedrals: D1356 and D3569
group_by_atom_attr
args: atom attribute (e.g. 'name') ; list of atoms (e.g. $mol->all_atoms)
returns array of AtomGroup objects
group_by_atom_attrs
args: array reference of multiple atom attributes (e.g. ['resname', 'chain' ]); list of atoms.
returns array of AtomGroup objects
find_bonds_brute
The arguments are key_value pairs of bonding criteria (see example below).
This method returns bonds between bond_atoms and the candidates using the criteria (many of wich have defaults).
my @oxy_bonds = $hack->find_bonds_brute(
bond_atoms => [$hg],
candidates => [$mol->all_atoms],
fudge => 0.45,
max_bonds => 6,
);
fudge is optional with Default is 0.45 (open babel uses same default); max_bonds is optional with default of 99. max_bonds is compared against the atom bond count, which are incremented during the search. Before returning the bonds, the bond_count are returned the values before the search. For now, molecules are responsible for setting the number of bonds in atoms. find_bonds_brute uses a bruteforce algorithm that tests the interatomic separation against the sum of the covalent radii + fudge. It will not test for bond between atoms if either atom has >= max_bonds. It does not return a self bond for an atom ( next if refaddr($ati) == refaddr($atj)
).
mol_disulfide_bonds
my @ss = $bldr->mol_disulfide_bonds($mol, [$fudge]); # default $fudge = 0.15
Returns all disulfide bonds with lengths leq 2 x S->covalent_radius + $fudge. $mol can be an AtomGroup or Molecule.
find_disulfide_bonds
my @ss = $bldr->find_disulfide_bonds(@atoms);
DEPRECATED. Uses find_bonds_brute with fudge of 0.15 and max_bonds 1. mol_disulfides was developed after finding funky cysteine clusters (e.g. 1f3h)
rmsd ($group1,$group2,$weights)
my $rmsd = $bldr->rmsd($group1, $group2, [$weights]); #optional weights need to be tested
Computes the root mean square deviation between atomic vectors of the two AtomGroup/Molecule objects.
superpose_rt ($group1, $group2)
WARNING: 1. needs more testing (feel free to contribute tests!). 2. may shift to another class.
args: two hackmol objects (HackaMol::AtomGroup or HackaMol::Molecule) with same number of atoms. This method is intended to be very flexible. It does not check meta data of the atoms, it just pulls the vectors in each group to calculate the rotation matrix and translation vector needed to superpose the second set on to the first set.
The vectors assumed to be in the same order, that's it!
A typical workflow:
my $bb1 = $mol1->select_group('backbone');
my $bb2 = $mol2->select_group('backbone');
my ($rmat,$trans,$rmsd) = HackaMol->new()->superpose_rt($bb1,$bb2);
# $rmsd is the RMSD between backbones
# to calculate rmsd between other atoms after the backbone alignment
$mol2->rotate_translate($rmat,$trans);
my $total_rmsd = HackaMol->new()->rmsd($mol1,$mol2);
# $total_rmsd is from all atoms in each mol
the algorithm is lifted from Bio::PDB::Structure, which implements algorithm from S. Kearsley, Acta Cryst. A45, 208-210 1989
returns: 1. rotation matrix [3 rows, each is a MVR , e.g. x' = row_1 * xyz] 2. translation vector (MVR) 3. rmsd
ATTRIBUTES
name
name is a rw str provided by HackaMol::NameRole.
readline_func
hook to add control to parsing files
HackaMol->new(
readline_func => sub {return "PDB_SKIP" unless /LYS/ }
)
->pdbid_mol("2cba")
->print_pdb; # will parse lysines because the PDB reader looks for the PDB_SKIP return value
SYNOPSIS
# simple example: load pdb file and extract the disulfide bonds
use HackaMol;
my $bldr = HackaMol->new( name => 'builder');
my $mol = $bldr->pdbid_mol('1kni');
my @disulfide_bonds = $bldr->find_disulfide_bonds( $mol->all_atoms );
print $_->dump foreach @disulfide_bonds;
See the above executed in this linked notebook
SEE ALSO
AUTHOR's PERSPECTIVE
Over the years, I have found HackaMol most expeditious wrt testing out ideas or whipping up setups and analyses for thousands of molecules each with several thousand atoms. This is the realm of varied systems (lots of molecules often from the protein databank). For the analysis of MD simulations that hit the next order of magnitude wrt numbers of atoms and frames, I use other tools (such as the python library, MDAnalysis). This is the realm where the system (molecular make-up) varies more slowly often with modeled solvent potentially with millions of simulated frames. The future development of HackaMol will seek to remain in its lane and better serve that space.
Perl is forever flexible and powerful; HackaMol has the potential to be a broadly useful tool in this region of molecular information, which is often dirty and not well-specified. Once the object system is stable in the Perl core, I plan to rework HackaMol to remove dependencies and improve performance. However, this tool will most likely never advance to the realm of analyzing molecular dynamics simulations. For the analysis of MD simulations with explicit solvent, please look into the python libraries MDAnalysis and MDTraj. Bio3D, in R, is also very cool and able to nicely handle simulation data for biological molecules stripped of solvent. In my view, Bio3D seems to fill the space between HackaMol and MD simulations with explicit solvent. These suggested tools are only the ones I am familiar with; I'm sure there are other very cool tools out there!
EXTENDS
CONSUMES
AUTHOR
Demian Riccardi <demianriccardi@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2017 by Demian Riccardi.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.