NAME

HackaMol - HackaMol: Object-Oriented Library for Molecular Hacking

VERSION

version 0.053

DESCRIPTION

The HackaMol publication has a more complete description of the library (pdf available from researchgate).

Citation: J. Chem. Inf. Model., 2015, 55, 721

Loading the HackaMol library in a script with

use HackaMol;

provides attributes and methods of a builder class. It also loads all the classes provided by the core so including them is not necessary, e.g.:

use HackaMol::Atom;
use HackaMol::Bond;
use HackaMol::Angle;
use HackaMol::Dihedral;
use HackaMol::AtomGroup;
use HackaMol::Molecule;

The methods, described below, facilitate the creation of objects from files and other objects. It is a builder class that evolves more rapidly than the classes for the molecular objects. For example, the superpose_rt and rmsd methods will likely be moved to a more suitable class or functional module.

METHODS

pdbid_mol

one argument: pdbid

This method will download the pdb, unless it exists, and load it into a HackaMol::Molecule object. For example,

my $mol = HackaMol->new->pdbid_mol('2cba');

read_file_atoms

one argument: filename.

This method parses the file (e.g. file.xyz, file.pdb) and returns an array of HackaMol::Atom objects. It uses the filename postfix to decide which parser to use. e.g. file.pdb will trigger the pdb parser.

read_file_mol

one argument: filename.

This method parses the file (e.g. file.xyz, file.pdb) and returns a HackaMol::Molecule object.

read_file_push_coords_mol

two arguments: filename and a HackaMol::Molecule object.

This method reads the coordinates from a file and pushes them into the atoms contained in the molecule. Thus, the atoms in the molecule and the atoms in the file must be the same.

build_bonds

takes a list of atoms and returns a list of bonds. The bonds are generated for "list neighbors" by simply stepping through the atom list one at a time. e.g.

my @bonds = $hack->build_bonds(@atoms[1,3,5]);

will return two bonds: B13 and B35

build_angles

takes a list of atoms and returns a list of angles. The angles are generated analagously to build_bonds, e.g.

my @angles = $hack->build_angles(@atoms[1,3,5]);

will return one angle: A135

build_dihedrals

takes a list of atoms and returns a list of dihedrals. The dihedrals are generated analagously to build_bonds, e.g.

my @dihedral = $hack->build_dihedrals(@atoms[1,3,5]);

will croak! you need atleast four atoms.

my @dihedral = $hack->build_dihedrals(@atoms[1,3,5,6,9]);

will return two dihedrals: D1356 and D3569

group_by_atom_attr

args: atom attribute (e.g. 'name') ; list of atoms (e.g. $mol->all_atoms)

returns array of AtomGroup objects

group_by_atom_attrs

args: array reference of multiple atom attributes (e.g. ['resname', 'chain' ]); list of atoms.

returns array of AtomGroup objects

find_bonds_brute

The arguments are key_value pairs of bonding criteria (see example below).

This method returns bonds between bond_atoms and the candidates using the criteria (many of wich have defaults).

my @oxy_bonds = $hack->find_bonds_brute(
                                  bond_atoms => [$hg],
                                  candidates => [$mol->all_atoms],
                                  fudge      => 0.45,
                                  max_bonds  => 6,
);

fudge is optional with Default is 0.45 (open babel uses same default); max_bonds is optional with default of 99. max_bonds is compared against the atom bond count, which are incremented during the search. Before returning the bonds, the bond_count are returned the values before the search. For now, molecules are responsible for setting the number of bonds in atoms. find_bonds_brute uses a bruteforce algorithm that tests the interatomic separation against the sum of the covalent radii + fudge. It will not test for bond between atoms if either atom has >= max_bonds. It does not return a self bond for an atom ( next if refaddr($ati) == refaddr($atj) ).

mol_disulfide_bonds

my @ss = $bldr->mol_disulfide_bonds($mol, [$fudge]); # default $fudge = 0.15

Returns all disulfide bonds with lengths leq 2 x S->covalent_radius + $fudge. $mol can be an AtomGroup or Molecule.

find_disulfide_bonds

my @ss = $bldr->find_disulfide_bonds(@atoms); 

DEPRECATED. Uses find_bonds_brute with fudge of 0.15 and max_bonds 1. mol_disulfides was developed after finding funky cysteine clusters (e.g. 1f3h)

rmsd ($group1,$group2,$weights)

my $rmsd = $bldr->rmsd($group1, $group2, [$weights]); #optional weights need to be tested

Computes the root mean square deviation between atomic vectors of the two AtomGroup/Molecule objects.

superpose_rt ($group1, $group2)

WARNING: 1. needs more testing (feel free to contribute tests!). 2. may shift to another class.

args: two hackmol objects (HackaMol::AtomGroup or HackaMol::Molecule) with same number of atoms. This method is intended to be very flexible. It does not check meta data of the atoms, it just pulls the vectors in each group to calculate the rotation matrix and translation vector needed to superpose the second set on to the first set.

The vectors assumed to be in the same order, that's it!

A typical workflow:

my $bb1 = $mol1->select_group('backbone');
my $bb2 = $mol2->select_group('backbone');
my ($rmat,$trans,$rmsd) = HackaMol->new()->superpose_rt($bb1,$bb2);
# $rmsd is the RMSD between backbones

# to calculate rmsd between other atoms after the backbone alignment
$mol2->rotate_translate($rmat,$trans);
my $total_rmsd = HackaMol->new()->rmsd($mol1,$mol2);
# $total_rmsd is from all atoms in each mol

the algorithm is lifted from Bio::PDB::Structure, which implements algorithm from S. Kearsley, Acta Cryst. A45, 208-210 1989

returns: 1. rotation matrix [3 rows, each is a MVR , e.g. x' = row_1 * xyz] 2. translation vector (MVR) 3. rmsd

ATTRIBUTES

name

name is a rw str provided by HackaMol::NameRole.

readline_func

hook to add control to parsing files

HackaMol->new(
                readline_func => sub {return "PDB_SKIP" unless /LYS/ }
)
->pdbid_mol("2cba")
->print_pdb; # will parse lysines because the PDB reader looks for the PDB_SKIP return value

SYNOPSIS

# simple example: load pdb file and extract the disulfide bonds

use HackaMol;

my $bldr = HackaMol->new( name => 'builder');
my $mol  = $bldr->pdbid_mol('1kni');

my @disulfide_bonds = $bldr->find_disulfide_bonds( $mol->all_atoms );

print $_->dump foreach @disulfide_bonds;

See the above executed in this linked notebook

SEE ALSO

AUTHOR's PERSPECTIVE

Over the years, I have found HackaMol most expeditious wrt testing out ideas or whipping up setups and analyses for thousands of molecules each with several thousand atoms. This is the realm of varied systems (lots of molecules often from the protein databank). For the analysis of MD simulations that hit the next order of magnitude wrt numbers of atoms and frames, I use other tools (such as the python library, MDAnalysis). This is the realm where the system (molecular make-up) varies more slowly often with modeled solvent potentially with millions of simulated frames. The future development of HackaMol will seek to remain in its lane and better serve that space.

Perl is forever flexible and powerful; HackaMol has the potential to be a broadly useful tool in this region of molecular information, which is often dirty and not well-specified. Once the object system is stable in the Perl core, I plan to rework HackaMol to remove dependencies and improve performance. However, this tool will most likely never advance to the realm of analyzing molecular dynamics simulations. For the analysis of MD simulations with explicit solvent, please look into the python libraries MDAnalysis and MDTraj. Bio3D, in R, is also very cool and able to nicely handle simulation data for biological molecules stripped of solvent. In my view, Bio3D seems to fill the space between HackaMol and MD simulations with explicit solvent. These suggested tools are only the ones I am familiar with; I'm sure there are other very cool tools out there!

EXTENDS

CONSUMES

AUTHOR

Demian Riccardi <demianriccardi@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2017 by Demian Riccardi.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.