NAME
Chemistry::File::SMILES - SMILES linear notation parser/writer
SYNOPSIS
#!/usr/bin/perl
use Chemistry::File::SMILES;
# parse a SMILES string
my $s = 'C1CC1(=O)[O-]';
my $mol = Chemistry::Mol->parse($s, format => 'smiles');
# print a SMILES string
print $mol->print(format => 'smiles');
# print a unique (canonical) SMILES string
print $mol->print(format => 'smiles', unique => 1);
# parse a SMILES file
my @mols = Chemistry::Mol->read("file.smi", format => 'smiles');
# write a multiline SMILES file
Chemistry::Mol->write("file.smi", mols => [@mols]);
DESCRIPTION
This module parses and writes SMILES (Simplified Molecular Input Line Entry System) strings. It is a file I/O driver for the PerlMol project (http://www.perlmol.org/) and registers the 'smiles' format with Chemistry::Mol.
This parser interprets anything after whitespace as the molecule's name; for example, when the following SMILES string is parsed, $mol->name will be set to "Methyl chloride":
CCl Methyl chloride
The name is not included by default on output. However, if the name option is defined, the name will be included after the SMILES string, separated by a tab.
print $mol->print(format => 'smiles', name => 1);
Multiline SMILES and SMILES files
A file or string can contain multiple molecules, one per line.
CCl Methyl chloride
CO Methanol
Files with the extension '.smi' are assumed to have this format.
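The line layout described above (a SMILES string, then whitespace, then an optional name) can be illustrated with a small standalone sketch. The helper split_smiles_lines is hypothetical and is not part of Chemistry::File::SMILES; it only mimics the layout and assumes that blank lines are skipped, which the module itself may handle differently.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Chemistry::File::SMILES: split multiline
# SMILES text into [smiles, name] pairs, one per non-blank line.
sub split_smiles_lines {
    my ($text) = @_;
    my @records;
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+|\s+$//g;     # trim surrounding whitespace
        next unless length $line;    # assumption: blank lines are ignored
        # The first whitespace-delimited token is the SMILES string;
        # everything after it is taken as the molecule's name.
        my ($smiles, $name) = split ' ', $line, 2;
        push @records, [$smiles, $name];
    }
    return @records;
}

my @records = split_smiles_lines("CCl Methyl chloride\nCO\tMethanol\n");
printf "%s => %s\n", $_->[0], $_->[1] // '(no name)' for @records;
```

Because the split is limited to two fields, spaces inside the name are preserved, and a line with no name yields an undefined name rather than an empty string.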
OPTIONS
- aromatic
On output, detect aromatic atoms and bonds by means of the Chemistry::Ring module, and represent the organic aromatic atoms with lowercase symbols.
- unique
When used on output, canonicalize the structure if it hasn't been canonicalized already and generate a unique SMILES string. This option implies "aromatic".
- kekulize
When used on input, assign single or double bond orders to "aromatic" or otherwise unspecified bonds (i.e., generate the Kekule structure). If false, the bond orders will remain single. This option is true by default. This uses assign_bond_orders from the Chemistry::Bond::Find module.
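As a rough sketch of how these options might be combined, the following parses benzene with and without kekulize and then writes it back with the output options. It assumes Chemistry::File::SMILES and its dependencies are installed, and that the bonds method of Chemistry::Mol and the order method of Chemistry::Bond are available; the exact strings printed are not shown because they depend on the module version.

```perl
use strict;
use warnings;
use Chemistry::File::SMILES;

# Input: kekulize is true by default, so aromatic bonds get single or
# double orders; with kekulize => 0 the bond orders remain single.
my $kekule = Chemistry::Mol->parse('c1ccccc1', format => 'smiles');
my $plain  = Chemistry::Mol->parse('c1ccccc1', format => 'smiles', kekulize => 0);

print "kekulized orders: ", join(' ', map { $_->order } $kekule->bonds), "\n";
print "plain orders:     ", join(' ', map { $_->order } $plain->bonds), "\n";

# Output: default form, aromatic form (lowercase symbols), and the
# unique (canonical) form, which implies "aromatic".
print $kekule->print(format => 'smiles'), "\n";
print $kekule->print(format => 'smiles', aromatic => 1), "\n";
print $kekule->print(format => 'smiles', unique => 1), "\n";
```

The kekulize option matters only when reading, while aromatic and unique matter only when writing, so they can be set independently for each direction.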
CAVEATS
The parser does not support branches that start before an atom, such as (OC)C, which according to some variants of the SMILES specification should be equivalent to C(OC) and COC. Many other tools don't implement this rule either.
VERSION
0.41
SEE ALSO
Chemistry::Mol, Chemistry::File
The SMILES Home Page at http://www.daylight.com/dayhtml/smiles/
The Daylight Theory Manual at http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
The PerlMol website http://www.perlmol.org/
AUTHOR
Ivan Tubert <itub@cpan.org>
COPYRIGHT
Copyright (c) 2004 Ivan Tubert. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.