NAME
UMLS::Interface - Perl interface to the Unified Medical Language System (UMLS)
SYNOPSIS
use UMLS::Interface;
my $umls = UMLS::Interface->new();
die "Unable to create UMLS::Interface object.\n" if(!$umls);
my ($errCode, $errString) = $umls->getError();
die "$errString\n" if($errCode);
my $term1 = "blood";
my @tList1 = $umls->getConceptList($term1);
my $cui1 = pop @tList1;
my $term2 = "cell";
my @tList2 = $umls->getConceptList($term2);
my $cui2 = pop @tList2;
my $exists1 = $umls->checkConceptExists($cui1);
my $exists2 = $umls->checkConceptExists($cui2);
if($exists1) { print "$term1($cui1) exists in your UMLS view.\n"; }
else { print "$term1($cui1) does not exist in your UMLS view.\n"; }
if($exists2) { print "$term2($cui2) exists in your UMLS view.\n"; }
else { print "$term2($cui2) does not exist in your UMLS view.\n"; }
print "\n";
my @cList1 = $umls->getTermList($cui1);
my @cList2 = $umls->getTermList($cui2);
print "The terms associated with $term1 ($cui1):\n";
foreach my $c1 (@cList1) {
print " => $c1\n";
} print "\n";
print "The terms associated with $term2 ($cui2):\n";
foreach my $c2 (@cList2) {
print " => $c2\n";
} print "\n";
my $lcs = $umls->findLeastCommonSubsumer($cui1, $cui2);
print "The least common subsumer between $term1 ($cui1) and \n";
print "$term2 ($cui2) is $lcs\n\n";
my @shortestpath = $umls->findShortestPath($cui1, $cui2);
print "The shortest path between $term1 ($cui1) and $term2 ($cui2):\n";
print " => @shortestpath\n\n";
my $pathstoroot = $umls->pathsToRoot($cui1);
print "The paths from $term1 ($cui1) to the root:\n";
foreach my $path (@{$pathstoroot}) {
print " => $path\n";
} print "\n";
my $mindepth = $umls->findMinimumDepth($cui1);
my $maxdepth = $umls->findMaximumDepth($cui1);
print "The minimum depth of $term1 ($cui1) is $mindepth\n";
print "The maximum depth of $term1 ($cui1) is $maxdepth\n\n";
my @children = $umls->getChildren($cui2);
print "The child(ren) of $term2 ($cui2) are: @children\n\n";
my @parents = $umls->getParents($cui2);
print "The parent(s) of $term2 ($cui2) are: @parents\n\n";
my @definitions = $umls->getCuiDef($cui1);
print "The definition(s) of $term1 ($cui1) are:\n";
foreach my $def (@definitions) {
print " => $def\n";
} print "\n";
print "The semantic type(s) of $term1 ($cui1) and the semantic\n";
print "definition are:\n";
my @sts = $umls->getSt($cui1);
foreach my $st (@sts) {
my @abrs = $umls->getStAbr($st);
foreach my $abr (@abrs) {
my $string = $umls->getStString($abr);
my $def = $umls->getStDef($abr);
print " => $string ($abr) : $def\n";
}
} print "\n";
ABSTRACT
This package provides a Perl interface to the Unified Medical Language System. The package is set up to access pre-specified sources of the UMLS stored in a MySQL database. It was created primarily for use with the UMLS::Similarity package, which measures the semantic relatedness of concepts.
INSTALL
To install the module, run the following magic commands:
perl Makefile.PL
make
make test
make install
This will install the module in the standard location. You will, most probably, require root privileges to install in standard system directories. To install in a non-standard directory, specify a prefix during the 'perl Makefile.PL' stage as:
perl Makefile.PL PREFIX=/home/sid
It is possible to modify other parameters during installation. The details can be found in the ExtUtils::MakeMaker documentation. However, we strongly recommend leaving the other parameters alone unless you know what you are doing.
DESCRIPTION
This package provides a Perl interface to the Unified Medical Language System (UMLS). The UMLS is a knowledge representation framework designed to support broad-scope biomedical research queries. There are three major sources in the UMLS: the Metathesaurus, which is a taxonomy of medical concepts; the Semantic Network, which categorizes concepts in the Metathesaurus; and the SPECIALIST Lexicon, which contains a list of biomedical and general English terms used in the biomedical domain. The UMLS-Interface package is set up to access the Metathesaurus and the Semantic Network stored in a MySQL database.
DATABASE SETUP
The interface assumes that the UMLS is present as a MySQL database. The name of the database can be passed as a configuration option at initialization. If the name is not provided at initialization, the default value is used: the database for the UMLS is called 'umls'.
The UMLS database must contain seven tables:
1. MRREL
2. MRCONSO
3. MRSAB
4. MRDOC
5. MRDEF
6. MRSTY
7. SRDEF
All other tables in the database are ignored. If any of these tables is missing, an error is raised.
The INSTALL file explains how to install the UMLS and the MySQL database.
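The table requirement above can be checked ahead of time. The following is a minimal sketch, not the module's own check: missing_tables is a hypothetical helper, and it assumes you have already obtained the list of table names present in your database (for example, from a SHOW TABLES query). It also assumes that table names should be compared case-insensitively, which may not match your MySQL configuration.

```perl
use strict;
use warnings;

# Table names taken from the list of required tables above.
my @REQUIRED_TABLES = qw(MRREL MRCONSO MRSAB MRDOC MRDEF MRSTY SRDEF);

# missing_tables(@present): hypothetical helper, not part of UMLS::Interface.
# Returns the required tables that do not appear in @present. Names are
# compared case-insensitively (an assumption); extra tables are ignored.
sub missing_tables {
    my @present = @_;
    my %seen = map { uc($_) => 1 } @present;
    return grep { !$seen{$_} } @REQUIRED_TABLES;
}

# Example: a database that lacks MRDEF and SRDEF but has an extra table.
my @missing = missing_tables(qw(mrrel MRCONSO MRSAB MRDOC MRSTY MRXW_ENG));
print "Missing tables: @missing\n" if @missing;
```

An empty return list means all seven tables were found.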
INITIALIZING THE MODULE
To create an instance of the interface object, using default values for all configuration options:
use UMLS::Interface;
my $interface = UMLS::Interface->new();
Database connection options can be passed through the my.cnf file. For example:
[client]
user = <username>
password = <password>
port = 3306
socket = /tmp/mysql.sock
database = umls
Or by passing the connection information when first creating an instance. For example:
$umls = UMLS::Interface->new({"driver" => "mysql",
"database" => "$database",
"username" => "$opt_username",
"password" => "$opt_password",
"hostname" => "$hostname",
"socket" => "$socket"});
'driver' -> Default value 'mysql'. This option specifies the Perl
DBD driver that should be used to access the
database. This implies that some other DBMS
(such as PostgreSQL) could also be used,
as long as there exist Perl DBD drivers to
access the database.
'database' -> Default value 'umls'. This option specifies the name
of the UMLS database.
'hostname' -> Default value 'localhost'. The name or the IP address
of the machine on which the database server is
running.
'socket' -> Default value '/tmp/mysql.sock'. The socket that
the database server is using.
'port' -> The port number on which the database server accepts
connections.
'username' -> Username to use to connect to the database server. If
not provided, the module attempts to connect as an
anonymous user.
'password' -> Password for access to the database server. If not
provided, the module attempts to access the server
without a password.
More information is provided in the INSTALL file Stage 5 Step D (search for 'Step D' and you will find it).
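To make the role of the defaults concrete, here is a minimal sketch of how connection options could be merged with the documented defaults and turned into a DBI data source name. This is an illustration only, not the module's implementation: build_dsn is a hypothetical helper, the option keys follow the constructor example above, and the mysql_socket attribute applies to the DBD::mysql driver.

```perl
use strict;
use warnings;

# Defaults as documented above; 'port' has no documented default.
my %DEFAULTS = (
    driver   => 'mysql',
    database => 'umls',
    hostname => 'localhost',
    socket   => '/tmp/mysql.sock',
);

# build_dsn(\%options): hypothetical helper, not part of UMLS::Interface.
# Caller-supplied options override the defaults; the result is a DSN
# string of the kind accepted by DBI->connect.
sub build_dsn {
    my ($options) = @_;
    my %opt = (%DEFAULTS, %{ $options || {} });
    my $dsn = "dbi:$opt{driver}:database=$opt{database};host=$opt{hostname}";
    $dsn .= ";port=$opt{port}" if defined $opt{port};
    # mysql_socket is specific to DBD::mysql, so only add it for that driver.
    $dsn .= ";mysql_socket=$opt{socket}" if $opt{driver} eq 'mysql';
    return $dsn;
}

# With no options, every documented default is used.
print build_dsn({}), "\n";
```

The username and password are not part of the DSN; DBI->connect takes them as separate arguments.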
Configuration file
A configuration file specifies which sources and which relations are to be used. The default source is the Medical Subject Headings (MSH) vocabulary and the default relations are the PAR/CHD relations.
'config' -> File containing the source and relation parameters
The configuration file can be passed when instantiating the UMLS-Interface, in the same way as the connection options. For example:
$umls = UMLS::Interface->new({"driver" => "mysql",
"database" => $database,
"username" => $opt_username,
"password" => $opt_password,
"hostname" => $hostname,
"socket" => $socket,
"config" => $configfile});
or
$umls = UMLS::Interface->new({"config" => $configfile});
The format of the configuration file is as follows:
SAB :: <include|exclude> <source1, source2, ... sourceN>
REL :: <include|exclude> <relation1, relation2, ... relationN>
For example, if we wanted to use the MSH vocabulary with only the RB/RN relations, the configuration file would be:
SAB :: include MSH
REL :: include RB, RN
or
SAB :: include MSH
REL :: exclude PAR, CHD
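The SAB/REL line format shown above is simple enough to parse with a regular expression. The following is a minimal sketch of such a parser, written only to illustrate the format. parse_config_lines is a hypothetical helper and is not how UMLS::Interface reads its configuration file; it assumes one SAB or REL entry per line and treats any other non-blank line as an error.

```perl
use strict;
use warnings;

# parse_config_lines(@lines): hypothetical helper, not part of UMLS::Interface.
# Returns a hash reference of the form
#   { SAB => { mode => 'include', values => ['MSH'] }, REL => { ... } }
sub parse_config_lines {
    my @lines = @_;
    my %config;
    foreach my $line (@lines) {
        next if $line =~ /^\s*$/;    # skip blank lines
        if ($line =~ /^\s*(SAB|REL)\s*::\s*(include|exclude)\s+(.+?)\s*$/) {
            my ($key, $mode, $list) = ($1, $2, $3);
            # Entries are comma-separated; surrounding whitespace is dropped.
            my @values = grep { length } split /\s*,\s*/, $list;
            $config{$key} = { mode => $mode, values => \@values };
        }
        else {
            die "Unrecognized configuration line: $line\n";
        }
    }
    return \%config;
}

# Example: the MSH vocabulary with only the RB/RN relations.
my $config = parse_config_lines(
    'SAB :: include MSH',
    'REL :: include RB, RN',
);
foreach my $key (sort keys %$config) {
    print "$key $config->{$key}{mode}: @{ $config->{$key}{values} }\n";
}
```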
If you run one of the example programs in the utils/ directory, an example of the default configuration file will be written to the configuration directory. The configuration directory can be specified during the first run; running one of the programs will show you how.
SEE ALSO
http://tech.groups.yahoo.com/group/umls-similarity/
http://search.cpan.org/dist/UMLS-Similarity/
AUTHOR
Bridget T McInnes <bthomson@cs.umn.edu>
Ted Pedersen <tpederse@d.umn.edu>
COPYRIGHT
Copyright (c) 2007-2009
Bridget T. McInnes, University of Minnesota
bthomson at cs.umn.edu
Ted Pedersen, University of Minnesota Duluth
tpederse at d.umn.edu
Siddharth Patwardhan, University of Utah, Salt Lake City
sidd@cs.utah.edu
Serguei Pakhomov, University of Minnesota Twin Cities
pakh0002@umn.edu
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to
The Free Software Foundation, Inc.,
59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.