NAME

UMLS::Interface - Perl interface to the Unified Medical Language System (UMLS)

SYNOPSIS

use UMLS::Interface;

my $umls = UMLS::Interface->new();

die "Unable to create UMLS::Interface object.\n" if(!$umls); 

my $root = $umls->root();

my $term1    = "skull";

my $tList1   = $umls->getConceptList($term1);
my $cui1     = pop @{$tList1};

my $term2    = "hand";
my $tList2   = $umls->getDefConceptList($term2);

my $cui2     = shift @{$tList2};
my $exists1  = $umls->exists($cui1);
my $exists2  = $umls->exists($cui2);

if($exists1) { print "The concept $term1 ($cui1) exists in your UMLS view.\n"; }
else         { print "The concept $term1 ($cui1) does not exist in your UMLS view.\n"; }

if($exists2) { print "The concept $term2 ($cui2) exists in your UMLS view.\n"; }
else         { print "The concept $term2 ($cui2) does not exist in your UMLS view.\n"; }
print "\n";

my $cList1   = $umls->getTermList($cui1);
my $cList2   = $umls->getDefTermList($cui2);

print "The terms associated with $term1 ($cui1) using the SAB parameter:\n";
foreach my $c1 (@{$cList1}) {
   print " => $c1\n";
} print "\n";

print "The terms associated with $term2 ($cui2) using the SABDEF parameter:\n";
foreach my $c2 (@{$cList2}) {
   print " => $c2\n";
} print "\n";

my $lcs = $umls->findLeastCommonSubsumer($cui1, $cui2);
print "The least common subsumer between $term1 ($cui1) and ";
print "$term2 ($cui2) is @{$lcs}\n\n";

my $shortestpath = $umls->findShortestPath($cui1, $cui2);
print "The shortest path between $term1 ($cui1) and $term2 ($cui2):\n";
print "  => @{$shortestpath}\n\n";

my $pathstoroot   = $umls->pathsToRoot($cui1);
print "The paths from $term1 ($cui1) and the root:\n";
foreach my $path (@{$pathstoroot}) {
   print "  => $path\n";
} print "\n";

my $mindepth = $umls->findMinimumDepth($cui1);
my $maxdepth = $umls->findMaximumDepth($cui1);
print "The minimum depth of $term1 ($cui1) is $mindepth\n";
print "The maximum depth of $term1 ($cui1) is $maxdepth\n\n";

my $children = $umls->getChildren($cui2); 
print "The child(ren) of $term2 ($cui2) are: @{$children}\n\n";

my $parents = $umls->getParents($cui2);
print "The parent(s) of $term2 ($cui2) are: @{$parents}\n\n";

my $relations = $umls->getRelations($cui2);
print "The relation(s) of $term2 ($cui2) are: @{$relations}\n\n";

my $rels = $umls->getRelated($cui2, "PAR");
print "The parent(s) of $term2 ($cui2) are: @{$rels}\n\n";

my $definitions = $umls->getCuiDef($cui1);
print "The definition(s) of $term1 ($cui1) are:\n";
foreach my $def (@{$definitions}) {
   print "  => $def\n";
} print "\n";

my $sabs = $umls->getSab($cui1);

print "The sources containing $term1 ($cui1) are: @{$sabs}\n\n";

print "The semantic type(s) of $term1 ($cui1) and the semantic\n";

print "definition are:\n";
my $sts = $umls->getSt($cui1);
foreach my $st (@{$sts}) {

   my $abr = $umls->getStAbr($st);
   my $string = $umls->getStString($abr);
   my $def    = $umls->getStDef($abr);
   print "  => $string ($abr) : @{$def}\n";

} print "\n";

$umls->removeConfigFiles();

$umls->dropConfigTable();

ABSTRACT

This package provides a Perl interface to the Unified Medical Language System (UMLS). The package is set up to access pre-specified sources of the UMLS present in a MySQL database. The package was created primarily for use with the UMLS::Similarity package for measuring the semantic relatedness of concepts.

INSTALL

To install the module, run the following magic commands:

perl Makefile.PL
make
make test
make install

This will install the module in the standard location. You will, most probably, require root privileges to install in standard system directories. To install in a non-standard directory, specify a prefix during the 'perl Makefile.PL' stage as:

perl Makefile.PL PREFIX=/home/sid

It is possible to modify other parameters during installation. The details of these can be found in the ExtUtils::MakeMaker documentation. However, we strongly recommend leaving the other parameters alone unless you know what you're doing.

DESCRIPTION

This package provides a Perl interface to the Unified Medical Language System (UMLS). The UMLS is a knowledge representation framework designed to support broad-scope biomedical research queries. There are three major sources in the UMLS: the Metathesaurus, which is a taxonomy of medical concepts; the Semantic Network, which categorizes concepts in the Metathesaurus; and the SPECIALIST Lexicon, which contains a list of biomedical and general English terms used in the biomedical domain. The UMLS-Interface package is set up to access the Metathesaurus and the Semantic Network present in a MySQL database.

DATABASE SETUP

The interface assumes that the UMLS is present as a MySQL database. The name of the database can be passed as a configuration option at initialization. If the name is not provided at initialization, the default value is used: the database for the UMLS is called 'umls'.

The UMLS database must contain seven tables:

    1. MRREL
    2. MRCONSO
    3. MRSAB
    4. MRDOC
    5. MRDEF
    6. MRSTY
    7. SRDEF

All other tables in the database are ignored; if any of these tables is missing, an error is raised.
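
The table requirement can be illustrated with a short plain-Perl sketch. It is only an illustration: `missing_tables` is a hypothetical helper, not part of UMLS::Interface, and the list of tables present is hard-coded here, whereas in practice it would come from the database. Names are compared case-insensitively purely for simplicity.

```perl
use strict;
use warnings;

# Tables this document lists as required.
my @REQUIRED = qw(MRREL MRCONSO MRSAB MRDOC MRDEF MRSTY SRDEF);

# Hypothetical helper: return the required tables that are absent
# from the list of tables actually present (case-insensitive).
sub missing_tables {
    my ($present) = @_;
    my %have = map { uc($_) => 1 } @{$present};
    return [ grep { !$have{$_} } @REQUIRED ];
}

# Assumed example input: a database that lacks MRDOC and SRDEF.
my @present = qw(MRREL MRCONSO MRSAB MRDEF MRSTY mrxw_eng);
my $missing = missing_tables(\@present);

if (@{$missing}) {
    print "Missing required tables: @{$missing}\n";
}
else {
    print "All required tables are present.\n";
}
```

Extra tables such as the `mrxw_eng` entry above are simply ignored by the check, which matches the behaviour described for the interface.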

Instructions for installing the UMLS and the MySQL database are in the INSTALL file.

INITIALIZING THE MODULE

To create an instance of the interface object, using default values for all configuration options:

use UMLS::Interface;
my $interface = UMLS::Interface->new();

Database connection options can be passed through the my.cnf file. For example:

    [client]
    user = <username>
    password = <password>
    port = 3306
    socket = /tmp/mysql.sock
    database = umls

Or by passing the connection information when first creating an instance. For example:

    $umls = UMLS::Interface->new({"driver" => "mysql", 
				  "database" => "$database", 
				  "username" => "$opt_username",  
				  "password" => "$opt_password", 
				  "hostname" => "$hostname", 
				  "socket"   => "$socket"}); 

  'driver'       -> Default value 'mysql'. This option specifies the Perl 
                    DBD driver that should be used to access the
                    database. This implies that some other DBMS
                    (such as PostgreSQL) could also be used,
                    as long as there exists a Perl DBD driver to
                    access the database.
  'database'     -> Default value 'umls'. This option specifies the name
                    of the UMLS database.
  'hostname'     -> Default value 'localhost'. The name or the IP address
                    of the machine on which the database server is
                    running.
  'socket'       -> Default value '/tmp/mysql.sock'. The socket that 
                    the database server is using.
  'port'         -> The port number on which the database server accepts
                    connections.
  'username'     -> Username to use to connect to the database server. If
                    not provided, the module attempts to connect as an
                    anonymous user.
  'password'     -> Password for access to the database server. If not
                    provided, the module attempts to access the server
                    without a password.
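
Because every connection option has a default or fallback, a caller only needs to pass the settings that are actually known. The sketch below shows one way to assemble such a hash. `connection_options` is a hypothetical helper and the environment variable names are assumptions made for this example; neither is defined by UMLS::Interface.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the connection settings that are
# defined and non-empty, so the documented defaults apply to the rest.
sub connection_options {
    my (%candidate) = @_;
    my %option;
    foreach my $key (qw(driver database hostname socket port username password)) {
        my $value = $candidate{$key};
        next unless defined $value && length $value;
        $option{$key} = $value;
    }
    return \%option;
}

# The environment variable names below are assumptions for this sketch.
my $options = connection_options(
    database => 'umls',
    hostname => $ENV{UMLS_HOST},
    username => $ENV{UMLS_USER},
    password => $ENV{UMLS_PASS},
);

print "Passing: ", join(", ", sort keys %{$options}), "\n";

# With a database available, the hash reference would then be handed
# to the constructor:
# my $umls = UMLS::Interface->new($options);
```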

More information is provided in the INSTALL file Stage 5 Step D (search for 'Step D' and you will find it).

PARAMETERS

You can also pass other parameters that control the functionality of the Interface.pm module.

    $umls = UMLS::Interface->new({"forcerun"      => "1",
				  "realtime"      => "1",
				  "cuilist"       => "file",  
				  "verbose"       => "1", 
                                  "debugpath"     => "file"});

  'forcerun'     -> This parameter will bypass any command prompts such 
                    as asking if you would like to continue with the index 
                    creation. 

  'realtime'     -> This parameter will not create a database of path 
                    information (what we refer to as the index) but obtain
                    the path information about a concept on the fly

  'cuilist'      -> This parameter names a file containing a list 
                    of CUIs for which the path information should be 
                    stored - if a CUI isn't on the list, the path 
                    information for that CUI will not be stored

  'verbose'      -> This parameter will print out the table information 
                    to a config file in the UMLSINTERFACECONFIG directory

  'debugpath'    -> This prints out the path information to a file during
                    any of the realtime runs

You can also reconfigure these options by calling the reConfig method.

    $umls->reConfig({"forcerun"      => "1",
		     "realtime"      => "1",
		     "verbose"       => "1", 
                     "debugpath"     => "file"});

CONFIGURATION FILE

A configuration file specifies which sources and relations are to be used. The default source is the Medical Subject Headings (MSH) vocabulary and the default relations are the PAR/CHD relations.

'config' -> File containing the source and relation parameters

The configuration file can be passed when instantiating UMLS-Interface, in the same way as the connection options. For example:

    $umls = UMLS::Interface->new({"driver"      => "mysql", 
				  "database"    => $database, 
				  "username"    => $opt_username,  
				  "password"    => $opt_password, 
				  "hostname"    => $hostname, 
				  "socket"      => $socket,
                                  "config"      => $configfile});

    or

    $umls = UMLS::Interface->new({"config" => $configfile});

The format of the configuration file is as follows:

SAB :: <include|exclude> <source1, source2, ... sourceN>
REL :: <include|exclude> <relation1, relation2, ... relationN>
RELA :: <include|exclude> <rela1, rela2, ... relaN> 

SABDEF :: <include|exclude> <source1, source2, ... sourceN>
RELDEF :: <include|exclude> <relation1, relation2, ... relationN>

SAB, REL and RELA specify which sources and relations should be used when traversing the UMLS. For example, if we wanted to use the MSH vocabulary with only the RB/RN relations that have been identified as 'isa' RELAs, then the configuration file would be:

SAB :: include MSH
REL :: include RB, RN
RELA :: include inverse_isa, isa

If we did not care what type of RELA the RB/RN relations were, the configuration would be:

SAB :: include MSH
REL :: include RB, RN

If we wanted to use MSH and use any relation except for PAR/CHD, the configuration would be:

SAB :: include MSH
REL :: exclude PAR, CHD
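
To make the line format concrete, here is a minimal sketch of a parser for it. This is not the parser used by UMLS::Interface: `parse_config` is a hypothetical function, and details such as rejecting unrecognized lines are assumptions of the sketch.

```perl
use strict;
use warnings;

# Hypothetical parser for lines of the form
#   KEY :: include|exclude item1, item2, ... itemN
sub parse_config {
    my ($text) = @_;
    my %config;
    foreach my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/;    # skip blank lines
        if ($line =~ /^\s*(SAB|REL|RELA|SABDEF|RELDEF)\s*::\s*(include|exclude)\s+(.+?)\s*$/) {
            my ($key, $mode, $items) = ($1, $2, $3);
            $config{$key} = {
                mode  => $mode,
                items => [ split /\s*,\s*/, $items ],
            };
        }
        else {
            die "Unrecognized configuration line: $line\n";
        }
    }
    return \%config;
}

my $config = parse_config(<<'END');
SAB :: include MSH
REL :: include RB, RN
RELA :: include inverse_isa, isa
END

foreach my $key (sort keys %{$config}) {
    my $entry = $config->{$key};
    print "$key: $entry->{mode} @{$entry->{items}}\n";
}
```

Each key maps to a mode (include or exclude) and a list of items, which is all the information the format carries.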

The SABDEF and RELDEF are for obtaining a definition or extended definition of the CUI. SABDEF signifies which sources to extract the definition from. For example,

SABDEF :: include SNOMEDCT

would only return definitions that exist in the SNOMEDCT source, whereas:

SABDEF :: exclude SNOMEDCT

would use the definitions from the entire UMLS except for SNOMEDCT. The default, if you didn't specify SABDEF at all in the configuration file, would use the entire UMLS.

RELDEF is for the extended definition. It signifies which relations should be included when creating the extended definition of a given CUI. For example,

RELDEF :: include TERM, CUI, PAR, CHD, RB, RN

This would include in the definition the terms associated with the CUI, the CUI's definition and the definitions of the concepts related to the CUI through either a PAR, CHD, RB or RN relation. Similarly, using the exclude as in:

RELDEF :: exclude TERM, CUI, PAR, CHD, RB, RN

would use all of the relations except for the ones specified. If RELDEF is not specified, the default uses all of the relations, which consist of: TERM, CUI, PAR, CHD, RB, RN, RO, SYN, and SIB.
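
The include/exclude semantics for RELDEF can be sketched as a simple filter over the default list. `resolve_selection` is a hypothetical helper written for this illustration, not a function of the module.

```perl
use strict;
use warnings;

# Relations this document lists as the RELDEF default.
my @ALL_RELDEF = qw(TERM CUI PAR CHD RB RN RO SYN SIB);

# Hypothetical helper: resolve an include/exclude directive against
# the full list, preserving the order of the full list.
sub resolve_selection {
    my ($mode, $items, $all) = @_;
    return [ @{$all} ] unless defined $mode;    # nothing specified: use everything
    my %listed = map { $_ => 1 } @{$items};
    if ($mode eq 'include') {
        return [ grep { $listed{$_} } @{$all} ];
    }
    elsif ($mode eq 'exclude') {
        return [ grep { !$listed{$_} } @{$all} ];
    }
    die "Unknown mode '$mode'\n";
}

my @listed   = qw(TERM CUI PAR CHD RB RN);
my $included = resolve_selection('include', \@listed, \@ALL_RELDEF);
my $excluded = resolve_selection('exclude', \@listed, \@ALL_RELDEF);
my $default  = resolve_selection(undef, [], \@ALL_RELDEF);

print "include: @{$included}\n";
print "exclude: @{$excluded}\n";
print "default: @{$default}\n";
```

With the list used in the examples, include keeps the six named entries and exclude leaves RO, SYN and SIB.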

I know that TERM and CUI are not 'relations', but we needed a way to specify them and this seemed to make the most sense at the time.

An example of the configuration file can be seen in the samples/ directory.

FUNCTION DESCRIPTIONS

Configuration Functions

returnTableNames

description:

returns the table names in both human readable and hex form

input:

output:

$hash <- reference to a hash containing the table names 
         in human readable and hex form

example:

my $hash = $umls->returnTableNames();

dropConfigTable

description:

removes the configuration tables

input:

output:

example:

$umls->dropConfigTable();

removeConfigFiles

description:

removes the configuration files

input:

output:

example:

$umls->removeConfigFiles();

reConfig

description:

function to re-initialize the interface configuration parameters

input:

$hash -> reference to hash containing parameters 

output:

example:

$umls->reConfig(\%parameters);

UMLS Functions

root

description:

returns the root

input:

output:

$string -> string containing the root

example:

my $root = $umls->root();

version

description:

returns the version of the UMLS currently being used

input:

output:

$version -> string containing the version

example:

my $version = $umls->version();
     

Parameter Functions

getConfigParameters

description:

returns the SAB/REL or SABDEF/RELDEF parameters set in the configuration file

input:

output:

$hash <- reference to hash containing parameters in the 
         configuration file - if there was no config
         file the hash is empty and the defaults are being
         used

example:

my $hash = $umls->getConfigParameters;

getSabString

description:

returns the sab (SAB) information from the configuration file

input:

output:

$string <- containing the SAB line from the config file

example:

my $string = $umls->getSabString;

getRelString

description:

returns the relation (REL) information from the configuration file

input:

output:

$string <- containing the REL line from the config file

example:

my $string = $umls->getRelString;

getRelaString

description:

returns the rela (RELA) information from the configuration file

input:

output:

$string <- containing the RELA line from the config file

example:

my $string = $umls->getRelaString;

Metathesaurus Concept Functions

validCui

description:

checks to see if a CUI is valid

input:

$concept <- string containing a cui

output:

0 | 1    <- integer indicating if the cui is valid

example:

my $concept = "C0018563";	
if($umls->validCui($concept)) { 
  print "$concept is valid\n";
}
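
How validCui decides validity is not described here. As a purely syntactic pre-check before querying, a caller might test that a string has the usual CUI shape, the letter C followed by seven digits. The helper name `looks_like_cui` is hypothetical and the check says nothing about whether the concept is present in the database.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string has the shape of a UMLS CUI
# ("C" followed by seven digits). This checks the format only.
sub looks_like_cui {
    my ($string) = @_;
    return (defined $string && $string =~ /\AC[0-9]{7}\z/) ? 1 : 0;
}

foreach my $candidate ("C0018563", "c0018563", "C18563", "hand") {
    printf "%-10s %s\n", $candidate,
        looks_like_cui($candidate) ? "well-formed" : "not well-formed";
}
```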

exists

description:

function to check if a concept ID exists in the database.

input:

$concept <- string containing a cui

output:

1 | 0    <- integers indicating if the cui exists

example:

my $concept = "C0018563";	
if($umls->exists($concept)) { 
   print "$concept exists\n";
}

getRelated

description:

function that returns a list of concepts (@concepts) related to a concept $concept through a relation $rel

input:

$concept <- string containing cui
$rel     <- string containing a relation

output:

$array   <- reference to an array of cuis

example:

my $concept = "C0018563";
my $rel     = "SIB";
my $array   = $umls->getRelated($concept, $rel);
foreach my $related_concept (@{$array}) { 
   print "$related_concept\n";
}

getPreferredTerm

description:

function that returns the preferred term of a cui from the sources specified in the configuration file

input:

$concept <- string containing cui

output:

$string  <- string containing the preferred term

example:

my $concept = "C0018563";
my $string  = $umls->getPreferredTerm($concept);
print "The preferred term of $concept is $string\n";

getAllPreferredTerm

description:

function that returns the preferred term of a cui from the entire UMLS

input:

$concept <- string containing cui

output:

$string  <- string containing the preferred term

example:

my $concept = "C0018563";
my $string  = $umls->getAllPreferredTerm($concept);
print "The preferred term of $concept is $string\n";

getTermList

description:

function to map terms to a given cui from the sources specified in the configuration file using SAB

input:

$concept <- string containing cui

output:

$array   <- reference to an array of terms (strings)

example:

my $concept = "C0018563";
my $array   = $umls->getTermList($concept);
print "The terms associated with $concept are:\n";
foreach my $term (@{$array}) { print "  $term\n"; }

getDefTermList

description:

function to map terms to a given cui from the sources specified in the configuration file using SABDEF

input:

$concept <- string containing cui

output:

$array   <- reference to an array of terms (strings)

example:

my $concept = "C0018563";
my $array   = $umls->getDefTermList($concept);
print "The terms associated with $concept are:\n";
foreach my $term (@{$array}) { print "  $term\n"; }

getAllTerms

description:

function to map terms from the entire UMLS to a given cui

input:

$concept <- string containing cui

output:

$array   <- reference to an array containing terms (strings)

example:

my $concept = "C0018563";
my $array   = $umls->getAllTerms($concept);
print "The terms associated with $concept are:\n";
foreach my $term (@{$array}) { print "  $term\n"; }

getConceptList

description:

function to map a given term to a set of cuis in the sources specified in the configuration file by SAB

input:

$term  <- string containing a term

output:

$array <- reference to an array containing cuis

example:

my $term   = "hand";
my $array  = $umls->getConceptList($term);
print "The concepts associated with $term are:\n";
foreach my $concept (@{$array}) { print "  $concept\n"; }

getDefConceptList

description:

function to map a given term to a set of cuis in the sources specified in the configuration file by SABDEF

input:

$term  <- string containing a term

output:

$array <- reference to an array containing cuis

example:

my $term   = "hand";
my $array  = $umls->getDefConceptList($term);
print "The concepts associated with $term are:\n";
foreach my $concept (@{$array}) { print "  $concept\n"; }

getAllConcepts

description:

function to map a given term to a set of cuis in all the sources

input:

$term  <- string containing a term

output:

$array <- reference to an array containing cuis

example:

my $term   = "hand";
my $array  = $umls->getAllConcepts($term);
print "The concepts associated with $term are:\n";
foreach my $concept (@{$array}) { print "  $concept\n"; }

getCompounds

description:

function returns all the compounds in the sources specified in the configuration file

input:

output:

$hash <- reference to a hash containing cuis

example:

my $hash = $umls->getCompounds();
foreach my $term (sort keys %{$hash}) {
  print "$term\n";
}

getCuiList

description:

returns all of the cuis in the sources specified in the configuration file

input:

output:

$hash <- reference to a hash containing cuis

example:

my $hash = $umls->getCuiList();
foreach my $concept (sort keys %{$hash}) { 
   print "$concept\n";
}

getCuisFromSource

description:

returns the cuis from a specified source

input:

$sab   <- string containing the source's abbreviation

output:

$array <- reference to an array containing cuis

example:

my $sab   = "MSH";
my $array = $umls->getCuisFromSource($sab);
foreach my $concept (@{$array}) { 
  print "$concept\n";
}

getSab

description:

takes a cui as input and returns all of the sources in which it originated

input:

$concept <- string containing the cui 

output:

$array   <- reference to an array containing the sources (abbreviations)

example:

my $concept = "C0018563";	
my $array   = $umls->getSab($concept);
print "The concept ($concept) exists in sources:\n";
foreach my $sab (@{$array}) { print "  $sab\n"; }

getChildren

description:

returns the children of a concept - the relations that are considered children are predefined by the user in the configuration file. The default is the CHD relation.

input:

$concept <- string containing cui

output:

$array   <- reference to an array containing a list of cuis

example:

my $concept  = "C0018563";	
my $children = $umls->getChildren($concept);
print "The children of $concept are:\n";
foreach my $child (@{$children}) { print "  $child\n"; }

getParents

description:

returns the parents of a concept - the relations that are considered parents are predefined by the user in the configuration file. The default is the PAR relation.

input:

$concept <- string containing cui

output:

$array   <- reference to an array containing a list of cuis

example:

my $concept  = "C0018563";	
my $parents  = $umls->getParents($concept);
print "The parents of $concept are:\n";
foreach my $parent (@{$parents}) { print "  $parent\n"; }

getRelations

description:

returns the relations of a concept in the source specified by the user in the configuration file

input:

$concept <- string containing a cui

output:

$array   <- reference to an array containing strings of relations

example:

my $concept  = "C0018563";	
my $array    = $umls->getRelations($concept);
print "The relations associated with $concept are:\n";
foreach my $relation (@{$array}) { print "  $relation\n"; }

getRelationsBetweenCuis

description:

returns the relations, and their sources, between two concepts

input:

$concept1 <- string containing a cui
$concept2 <- string containing a cui

output:

$array    <- reference to an array containing the relations

example:

my $concept1  = "C0018563";
my $concept2  = "C0016129";
my $array     = $umls->getRelationsBetweenCuis($concept1,$concept2);
print "The relations between $concept1 and $concept2 are:\n";
foreach my $relation (@{$array}) { print "  $relation\n"; }

Metathesaurus Concept Definition Functions

getExtendedDefinition

description:

returns the extended definition of a cui given the relation and source information in the configuration file

input:

$concept <- string containing a cui

output:

$array   <- reference to an array containing the definitions

example:

my $concept = "C0018563";	
my $array   = $umls->getExtendedDefinition($concept);
print "The extended definition of $concept is:\n";
foreach my $def (@{$array}) { print "  $def\n"; }

getCuiDef

description:

returns the definition of the cui

input:

$concept <- string containing a cui
$sabflag <- 0 | 1 whether to include the source in with the definition 

output:

$array   <- reference to an array of definitions (strings)

example:

my $concept = "C0018563";	
my $array   = $umls->getCuiDef($concept);
print "The definition of $concept is:\n";
foreach my $def (@{$array}) { print "  $def\n"; }

Metathesaurus Concept Path Functions

depth

description:

function to return the maximum depth of a taxonomy.

input:

output:

$string <- string containing the depth

example:

 my $string = $umls->depth();
	   

pathsToRoot

description:

function to find all the paths from a concept to the root node of the is-a taxonomy.

input:

$concept <- string containing cui

output:

$array   <- array reference containing the paths

example:

my $concept = "C0018563";	
my $array   = $umls->pathsToRoot($concept);
print "The paths to the root for $concept are:\n";
foreach my $path (@{$array}) { print "  $path\n"; }
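
The idea behind enumerating paths to the root can be shown on a toy hierarchy without a database. Everything in this sketch is an assumption made for illustration: the node labels, the `paths_to_root` function, and the representation of a path as a space-separated string running from the root down to the concept. It also assumes the hierarchy has no cycles.

```perl
use strict;
use warnings;

# Toy is-a hierarchy: each node maps to its parents. Labels are invented.
my %parents = (
    root   => [],
    limb   => ['root'],
    organ  => ['root'],
    hand   => ['limb', 'organ'],
    finger => ['hand'],
);

# Return every path from the root down to $node, each as a
# space-separated string (an assumed representation).
sub paths_to_root {
    my ($node, $parents) = @_;
    my @up = @{ $parents->{$node} || [] };
    return ($node) unless @up;    # no parents: this node is a root
    my @paths;
    foreach my $parent (@up) {
        push @paths, map { "$_ $node" } paths_to_root($parent, $parents);
    }
    return @paths;
}

my @paths = paths_to_root('finger', \%parents);
print "$_\n" for @paths;
```

A concept with more than one parent, like `hand` above, yields more than one path, which is why an array of paths is returned rather than a single one.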

findMinimumDepth

description:

function returns the minimum depth of a concept given the sources and relations specified in the configuration file

input:

$concept <- string containing the cui

output:

$int     <- string containing the depth of the cui

example:

my $concept = "C0018563";	
my $int     = $umls->findMinimumDepth($concept);
print "The minimum depth of $concept is $int\n";

findMaximumDepth

description:

returns the maximum depth of a concept given the sources and relations specified in the configuration file

input:

$concept <- string containing the cui

output:

$int     <- string containing the depth of the cui

example:

my $concept = "C0018563";	
my $int     = $umls->findMaximumDepth($concept);
print "The maximum depth of $concept is $int\n";

findNumberOfCloserConcepts

description:

function that finds the number of cuis closer to concept1 than concept2

input:

$concept1  <- the first concept
$concept2  <- the second concept

output:

$int <- number of cuis closer to concept1 than concept2

example:

my $concept1  = "C0018563";
my $concept2  = "C0016129";
my $int       = $umls->findNumberOfCloserConcepts($concept1,$concept2);
print "The number of closer concepts to $concept1 than $concept2 is $int\n";

findShortestPathLength

description:

function that finds the length of the shortest path

input:

$concept1  <- the first concept
$concept2  <- the second concept

output:

$int <- the length of the shortest path between them

example:

my $concept1  = "C0018563";
my $concept2  = "C0016129";
my $int       =  $umls->findShortestPathLength($concept1,$concept2);
print "The shortest path length between $concept1 and $concept2 is $int\n";

findShortestPath

description:

returns the shortest path between two concepts given the sources and relations specified in the configuration file

input:

$concept1 <- string containing the first cui
$concept2 <- string containing the second

output:

$array    <- reference to an array containing the shortest path(s)

example:

my $concept1  = "C0018563";
my $concept2  = "C0016129";
my $array     = $umls->findShortestPath($concept1,$concept2);
print "The shortest path(s) between $concept1 and $concept2 are:\n";
foreach my $path (@{$array}) { print "  $path\n"; }
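
For intuition, a shortest path over parent/child links can be found with a breadth-first search. The sketch below runs on an invented toy hierarchy and returns a single path. It is not the module's algorithm, which may return several equally short paths; the `shortest_path` function and all node labels are assumptions.

```perl
use strict;
use warnings;

# Toy hierarchy: child => parents. Labels are invented.
my %parents = (
    limb  => ['root'],
    head  => ['root'],
    hand  => ['limb'],
    skull => ['head'],
);

# Treat each parent/child link as an undirected edge.
my %adjacent;
foreach my $child (keys %parents) {
    foreach my $parent (@{ $parents{$child} }) {
        push @{ $adjacent{$child} },  $parent;
        push @{ $adjacent{$parent} }, $child;
    }
}

# Breadth-first search; returns one shortest path as an array
# reference, or an empty array reference if no path exists.
sub shortest_path {
    my ($start, $goal, $adjacent) = @_;
    my %previous = ($start => undef);
    my @queue    = ($start);
    while (@queue) {
        my $node = shift @queue;
        if ($node eq $goal) {
            my @path;
            for (my $n = $goal; defined $n; $n = $previous{$n}) {
                unshift @path, $n;
            }
            return \@path;
        }
        foreach my $next (sort @{ $adjacent->{$node} || [] }) {
            next if exists $previous{$next};    # already visited
            $previous{$next} = $node;
            push @queue, $next;
        }
    }
    return [];
}

my $path = shortest_path('hand', 'skull', \%adjacent);
print "@{$path}\n";
```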

findLeastCommonSubsumer

description:

returns the least common subsumer between two concepts given the sources and relations specified in the configuration file

input:

$concept1 <- string containing the first cui
$concept2 <- string containing the second

output:

$array    <- reference to an array containing the lcs(es)

example:

my $concept1  = "C0018563";
my $concept2  = "C0016129";
my $array     = $umls->findLeastCommonSubsumer($concept1,$concept2);
print "The LCS(es) between $concept1 and $concept2 are:\n";
foreach my $lcs (@{$array}) { print "  $lcs\n"; }

Metathesaurus Concept Propagation Functions

setPropagationParameters

description:

sets the propagation counts

input:

$hash <- reference to hash containing parameters
         debug         -> turn debug option on 
         icpropagation -> file containing icpropagation counts
         icfrequency   -> file containing icfrequency counts
         smooth        -> whether you want to smooth the frequency counts

example:

$umls->setPropagationParameters(\%hash);

getIC

description:

returns the information content of a given cui

input:

$concept <- string containing a cui

output:

$double  <- double containing its IC

example:

my $concept  = "C0018563";
my $double   = $umls->getIC($concept);
print "The IC of $concept is $double\n";

getProbability

description:

returns the probability of a given cui

input:

$concept <- string containing a cui

output:

$double  <- double containing its probability

example:

my $concept  = "C0018563";
my $double   = $umls->getProbability($concept);
print "The probability of $concept is $double\n";
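
Information content is conventionally defined as the negative logarithm of a concept's probability. The sketch below shows that relationship with invented numbers. Whether the module uses the natural logarithm, and whether its probability is exactly the propagated frequency divided by N, are assumptions of this example.

```perl
use strict;
use warnings;

# Information content as the negative natural log of a probability.
# The log base used by the module is an assumption here.
sub information_content {
    my ($probability) = @_;
    die "Probability must be in (0, 1]\n"
        unless $probability > 0 && $probability <= 1;
    return -log($probability);
}

# Assumed toy numbers: a propagated count of 25 out of N = 1000.
my $frequency   = 25;
my $n           = 1000;
my $probability = $frequency / $n;
my $ic          = information_content($probability);

printf "P = %.3f, IC = %.4f\n", $probability, $ic;
```

Rare concepts have low probability and therefore high information content; a concept with probability 1 has an information content of 0.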

getN

description:

returns the total number of CUIs (N)

input:

output:

$int  <- integer containing frequency

example:

my $int = $umls->getN();

getFrequency

description:

returns the propagation count (frequency) of a given cui

input:

$concept <- string containing a cui

output:

$double  <- double containing its frequency

example:

my $concept  = "C0018563";
my $double   = $umls->getFrequency($concept);
print "The frequency of $concept is $double\n";

getPropagationCuis

description:

returns all of the cuis to be propagated given the sources and relations specified by the user in the configuration file

input:

output:

$hash <- reference to hash containing the cuis

example:

my $hash = $umls->getPropagationCuis();

propagateCounts

description:

propagates the given frequency counts

input:

$hash <- reference to the hash containing the frequency counts

output:

$hash <- containing the propagation counts of all the cuis 
         given the sources and relations specified in the 
         configuration file

example:

my $phash = $umls->propagateCounts(\%fhash);
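
One common way to propagate counts is to give each concept its own frequency plus the frequencies of all of its distinct descendants. The sketch below applies that rule to an invented toy hierarchy. It is an assumption that this matches the module's behaviour, and the function names and labels are hypothetical.

```perl
use strict;
use warnings;

# Toy hierarchy: node => children. Labels and counts are invented.
my %children = (
    root  => ['limb', 'organ'],
    limb  => ['hand'],
    organ => ['hand'],
    hand  => [],
);
my %fhash = (root => 0, limb => 2, organ => 1, hand => 5);

# Collect the distinct descendants of $node into the %{$seen} set,
# so a child shared by two parents is counted once.
sub descendants {
    my ($node, $children, $seen) = @_;
    foreach my $child (@{ $children->{$node} || [] }) {
        next if $seen->{$child}++;
        descendants($child, $children, $seen);
    }
    return $seen;
}

# Propagated count = own count plus the counts of all distinct descendants.
sub propagate_counts {
    my ($fhash, $children) = @_;
    my %phash;
    foreach my $node (keys %{$fhash}) {
        my $total = $fhash->{$node};
        $total += $fhash->{$_} || 0
            for keys %{ descendants($node, $children, {}) };
        $phash{$node} = $total;
    }
    return \%phash;
}

my $phash = propagate_counts(\%fhash, \%children);
print "$_ => $phash->{$_}\n" for sort keys %{$phash};
```

In the toy data `hand` is reachable from the root by two routes, but its count contributes to the root only once.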

Semantic Network Functions

getSemanticRelation

description:

subroutine to get the relation(s) between two semantic types

input:

$st1   <- semantic type abbreviation
$st2   <- semantic type abbreviation

output:

$array <- reference to an array of semantic relation(s)

example:

my $st1   = "blor";
my $st2   = "bpoc";
my $array = $umls->getSemanticRelation($st1,$st2);
print "The relations between $st1 and $st2 are:\n";
foreach my $relation (@{$array}) { print "  $relation\n"; }

getSt

description:

returns the semantic type(s) of a given concept

input:

$concept <- string containing a concept

output:

$array   <- reference to an array containing the semantic type's TUIs
            associated with the concept

example:

my $concept  = "C0018563";	
my $array    = $umls->getSt($concept);
print "The semantic types associated with $concept are:\n";
foreach my $st (@{$array}) { print "  $st\n"; }

getSemanticGroup

description:

function returns the semantic group(s) associated with the concept

input:

$concept <- string containing cuis

output:

$array   <- array reference containing semantic groups

example:

my $concept  = "C0018563";	
my $array    = $umls->getSemanticGroup($concept);
print "The semantic group associated with $concept are:\n";
foreach my $sg (@{$array}) { print "  $sg\n"; }

stGetSemanticGroup

description:

function returns the semantic group(s) associated with a semantic type

input:

$st <- string containing semantic type abbreviations

output:

$array   <- array reference containing semantic groups

example:

my $st       = "bpoc";
my $array    = $umls->stGetSemanticGroup($st);
print "The semantic group associated with $st are:\n";
foreach my $sg (@{$array}) { print "  $sg\n"; }

getStString

description:

returns the full name of a semantic type given its abbreviation

input:

$st     <- string containing the abbreviation of the semantic type

output:

$string <- string containing the full name of the semantic type

example:

my $st     = "bpoc";
my $string = $umls->getStString($st);
print "The abbreviation $st stands for $string\n";

getStAbr

description:

returns the abbreviation of a semantic type given its TUI (UI)

input:

$tui    <- string containing the semantic type's TUI

output:

$string <- string containing the semantic type's abbreviation

example:

my $tui    = "T023";
my $string = $umls->getStAbr($tui);
print "The abbreviation of $tui is $string\n";

getStTui

description:

function to get a semantic type's TUI given its abbreviation

input:

$string <- string containing the semantic type's abbreviation

output:

$tui    <- string containing the semantic type's TUI

example:

my $string = "bpoc";
my $tui    = $umls->getStTui($string);
print "The tui of $string is $tui\n";

getStDef

description:

returns the definition of the semantic type - expecting abbreviation

input:

$st     <- string containing the semantic type's abbreviation

output:

$string <- string containing the semantic type's definition

example:

my $st     = "bpoc";
my $string = $umls->getStDef($st);
print "The definition of $st is $string\n";

Semantic Network Path Functions

stPathsToRoot

description:

This function finds all the paths from a semantic type (tui) to the root node of the is-a taxonomy in the semantic network

input:

$tui     <- string containing tui

output:

$array   <- array reference containing the paths

example:

my $tui   = "T023";
my $array = $umls->stPathsToRoot($tui);
print "The paths from $tui to the root are:\n";
foreach my $path (@{$array}) { print "  $path\n"; }

stFindShortestPath

description:

This function returns the shortest path between two semantic type TUIs.

input:

$st1   <- string containing the first tui
$st2   <- string containing the second tui

output:

$array <- reference to an array containing paths

example:

my $st1  = "T023";
my $st2  = "T029";
my $array     = $umls->stFindShortestPath($st1,$st2);
print "The shortest path(s) between $st1 and $st2 are:\n";
foreach my $path (@{$array}) { print "  $path\n"; }

Semantic Network Propagation Functions

loadStPropagationHash

description:

load the propagation hash for the semantic network

input:

$hash  <- reference to a hash containing probability counts

output:

example:

$umls->loadStPropagationHash(\%hash);

propagateStCounts

description:

propagates the given frequency counts of the semantic types

input:

$hash <- reference to the hash containing the frequency counts

output:

$hash <- containing the propagation counts of all the semantic types

example:

my $phash = $umls->propagateStCounts(\%fhash);

getStIC

description:

returns the information content of a given semantic type

input:

$st      <- string containing a semantic type

output:

$double  <- double containing its IC

example:

my $st = "bpoc";
my $double = $umls->getStIC($st);
print "The IC of $st is $double\n";

getStProbability

description:

returns the probability of a given semantic type

input:

$st      <- string containing a semantic type

output:

$double  <- double containing its probability

example:

my $st = "bpoc";
my $double = $umls->getStProbability($st);
print "The Probability of $st is $double\n";

getStN

description:

returns the total number of semantic types (N)

input:

output:

$int  <- double containing frequency

example:

my $int = $umls->getStN();

setStSmoothing

description:

function to set the smoothing parameter

input:

output:

example:

$umls->setStSmoothing();

REFERENCING

If you write a paper that has used UMLS-Interface in some way, we'd certainly be grateful if you sent us a copy and referenced UMLS-Interface. We have a published paper that provides a suitable reference:

@inproceedings{McInnesPP09,
   title={{UMLS-Interface and UMLS-Similarity : Open Source 
           Software for Measuring Paths and Semantic Similarity}}, 
   author={McInnes, B.T. and Pedersen, T. and Pakhomov, S.V.}, 
   booktitle={Proceedings of the American Medical Informatics 
              Association (AMIA) Symposium},
   year={2009}, 
   month={November}, 
   address={San Francisco, CA}
}

This paper can also be found at

http://www-users.cs.umn.edu/~bthomson/publications/pubs.html

or

http://www.d.umn.edu/~tpederse/Pubs/amia09.pdf

SEE ALSO

http://tech.groups.yahoo.com/group/umls-similarity/

http://search.cpan.org/dist/UMLS-Similarity/

AUTHOR

Bridget T McInnes <bthomson@cs.umn.edu>
Ted Pedersen <tpederse@d.umn.edu>

COPYRIGHT

Copyright (c) 2007-2009
Bridget T. McInnes, University of Minnesota
bthomson at cs.umn.edu

Ted Pedersen, University of Minnesota Duluth
tpederse at d.umn.edu

Siddharth Patwardhan, University of Utah, Salt Lake City
sidd at cs.utah.edu

Serguei Pakhomov, University of Minnesota Twin Cities
pakh0002 at umn.edu

Ying Liu, University of Minnesota
liux0935 at umn.edu

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to

The Free Software Foundation, Inc.,
59 Temple Place - Suite 330,
Boston, MA  02111-1307, USA.