NAME

CohortExplorer::Datasource - CohortExplorer datasource superclass

SYNOPSIS

# The code below shows methods your datasource class overrides;

package CohortExplorer::Application::REDCap::Datasource;
use base qw( CohortExplorer::Datasource );

sub authenticate { 
    
    my ($self, $opts) = @_;
            
    # authentication code...

      return $response
    
}

sub default_parameters {
    
     my ($self, $opts, $response) = @_;
      
     
     # get database handle (i.e., $self->dbh()) and run some SQL queries to get additional parameters
     # or, simply add some more parameters without querying the database
     
     return $default;
}

sub entity_structure {
     
     my ($self) = @_;
     
     my %struct = (
                  -columns =>  {
                                 entity_id => "rd.record",
                                 variable => "rd.field_name",
                                 value => "rd.value",
                                 table => "rm.form_name"
                   },
                   -from =>  [ -join => qw/redcap_data|rd <=>{project_id=project_id} redcap_metadata|rm/ ],
                   -where =>  { 
                                 "rd.project_id" => $self->project_id()
                    }
      );

      $struct{-columns}{visit} =  'rd.event_id-' . $self->init_event_id()  
      if ( $self->type() eq 'longitudinal');
     
      return \%struct;
 }
 
     
sub table_structure {
     
     my ($self) = @_;
     
     return {
             
              -columns => {
                             table => "GROUP_CONCAT( DISTINCT form_name )", 
                             variable_count => "COUNT( field_name )",
                             label => "element_label"
              },
             -from  => "redcap_metadata",
             -where => {
                         "project_id" => $self->project_id()
              },
             -order_by => "field_order",
             -group_by => "form_name"
    };
 }
 
 sub variable_structure {
     
     my ($self) = @_;
     
     return {
             -columns => {
                           variable => "field_name",
                           table => "form_name",
                           label => "element_label",
                           type => "IF( element_validation_type IS NULL, 'text', element_validation_type)",
                           category => "IF( element_enum like '%, %', REPLACE( element_enum, '\\\\n', '\n'), '')"
             },
            -from => "redcap_metadata",
            -where => { 
                         "project_id" => $self->project_id()
             },
            -order_by => "field_order"
    };
 }
 
 sub datatype_map {
    
  return {
              'int'         => 'signed',
             'float'        => 'decimal',
             'date_dmy'     => 'date',
             'date_mdy'     => 'date',
             'date_ymd'     => 'date',
             'datetime_dmy' => 'datetime'
  };
}

CONCEPTS

CohortExplorer::Datasource is an abstract factory; initialise() is the factory method that constructs and returns an object of the datasource supplied as an application option. This class reads the datasource configuration from the config file (default /etc/CohortExplorer/datasource-config.properties) to instantiate the datasource object. The config file takes the format below,

 <datasource Clinical> 
  namespace=Opal
  type=longitudinal
  static_tables=Demographics,FamilyHistory
  url=myhost
  dsn=DBI:mysql:database=opal;host=myhost;port=3306
  username=yourusername
  password=yourpassword
</datasource> 

<datasource Clinical1> 
  namespace=Opal
  type=longitudinal
  id_visit_separator=_
  name=Clinical
  url=myhost
  dsn=DBI:mysql:database=opal;host=myhost;port=3306
  username=yourusername
  password=yourpassword
</datasource> 

<datasource Drugs> 
  namespace=REDCap
  dsn=DBI:mysql:database=opal;host=myhost;port=3306
  username=yourusername
  password=yourpassword
</datasource>

Each blocks holds a unique datasource configuration. Apart from some reserved parameters, namespace, dsn, username and password it is up to the user to decide what parameters they want to include in the configuration file. The user can specify the actual name of the datasource using the name parameter provided the block name is an alias. If the name parameter is not found then the block name is assumed to be the actual name of the datasource. In the example above, both Clinical and Clinical1 connect to the same datasource (i.e. Clinical) but with different configurations. Once CohortExplorer::Datasource has successfully instantiated the datasource object the user can access the parameters by simply calling the methods which have the same name as the parameters. For example, the database handle can be retrieved by $self->dbh() and id_visit_separator by $self->id_visit_separator(). The namespace is the name of the repository housing the datasource.

PROCESSING

After instantiating the datasource object this class attempts to perform the following operations:

  1. Authenticates the user. The subsequent steps are only performed if the authentication is successful.

  2. Loads default parameters (if any).

  3. Validates the return from entity_structure and attempts to set entity_count and visit_max (for longitudinal datasources only).

  4. Validates the return from table_structure and attempts to set tables and its attributes.

  5. Validates the return from variable_structure and attempts to set variable and its attributes. The method also maps the variable types to their SQL types (default char(255)) only if datatype_map is overidden.

  6. Sets visit variables for the longitudinal datasources. The visit variables are valid to dynamic tables only and they represents the visit transformation of variables e.g., V1.Var, V2.Var ... Vmax.Var, Vany.Var and Vlast.Var. The prefix V1 represents first visit of the variable 'var', V2 represents the second visit, Vany implies any visit and Vlast last visit.

Subclass Hooks

The sub classes override the following hooks:

authenticate( $opts )

This method should return a response (a scalar) upon successful authentication otherwise undef. The method is called with one parameter, $opts which is a hash with application options as keys and their user-provided values as hash values.

default_parameters( $opts, $response )

This method should return a hash ref containing parameter name-value pairs. The user can run some SQL queries in case the parameters to be added to the datasource object first need to be retrieved from the database. The parameters used in calling this method are:

$opts is a hash with application options as keys and their user-provided values as hash values.

$response is the response received after successful authentication.

Note that this method and the methods below are only called if the authentication is successful.

entity_structure()

The method should return a hash ref defining the entity structure in the database. The hash ref must have the following keys:

-columns

entity_id

variable

value

table

visit (only required for longitudinal datasources)

-from

table specifications see SQL::Abstract::More

-where

where clauses see SQL::Abstract

table_structure()

The method should return a hash ref defining the table structure in the database. The table in this context implies questionnaires or forms. For example,

{
    -columns => {
                  table => "GROUP_CONCAT( DISTINCT form_name )", 
                  variable_count => "COUNT( field_name )",
                  label => "element_label"
    },
   -from  => "redcap_metadata",
   -where => {
               "project_id" => $self->project_id()
   },
  -order_by => "field_order",
  -group_by => "form_name"

};

the user should make sure the returned hash ref is able to produce the SQL output like the one below,

+-------------------+-----------------+------------------+
| table             | variable_count  | label            |
+-------------------+-----------------+------------------+
| demographics      |              26 | Demographics     |
| baseline_data     |              19 | Baseline Data    |
| month_1_data      |              20 | Month 1 Data     |
| month_2_data      |              20 | Month 2 Data     |
| month_3_data      |              28 | Month 3 Data     |
| completion_data   |               6 | Completion Data  |
'-------------------+-----------------+------------------'

Note that -columns hash ref must have the key table corresponding to form/questionnaire names and others columns can be table attributes. It is up to the user to decide what table attributes they think are suitable for table description.

variable_structure()

This method should return a hash ref defining the variable structure in the database. For example,

{
    -columns => {
                   variable => "field_name",
                   table => "form_name",
                   label => "element_label"
                   type => "IF( element_validation_type IS NULL, 'text', element_validation_type)",
                   category => "IF( element_enum like '%, %', REPLACE( element_enum, '\\\\n', '\n'), '')",
    },
   -from => "redcap_metadata",
   -where => { 
               "project_id" => $self->project_id()
    },
    -order_by => "field_order"
};

the user should make sure the returned hash ref is able to produce the SQL output like the one below,

+---------------------------+---------------+-------------------------+---------------+----------+
| variable                  | table         |label                    | category      | type     |
+---------------------------+---------------+-------------------------+---------------------------
| kt_v_b                    | baseline_data | Kt/V                    |               | float    |
| plasma1_b                 | baseline_data | Collected Plasma 1?     | 0, No         | text     |
|                           |               |                         | 1, Yes        |          |
| date_visit_1              | month_1_data  | Date of Month 1 visit   |               | date_ymd |
| alb_1                     | month_1_data  | Serum Albumin (g/dL)    |               | float    |
| prealb_1                  | month_1_data  | Serum Prealbumin (mg/dL)|               | float    |
| creat_1                   | month_1_data  | Creatinine (mg/dL)      |               | float    |
+---------------------------+---------------+-----------+-------------------------------+--------+

Note that -columns hash ref must have the key variable and table. Again it is up to the user to decide what variable attributes (i.e. meta data) they think define the variables in the datasource. The categories in category should be separated by newline.

datatype_map()

This method should return a hash ref with variable type as keys and equivalent SQL type (i.e., castable) as value.

SECURITY

When setting-up CohortExplorer for group usage it is advised to install the application using its debian package which is part of the release. The package greatly simplifies the installation and implements the security mechanism. The security measures include:

  • forcing taint mode and,

  • disabling the access to configuration files and log file to users other than the administrator or root (user).

DIAGNOSTICS

  • The configuration file is unable to be parsed by Config::General.

  • Failed to instantiate datasource package '<datasource pkg>' via new().

  • The return from methods default_paramters, entity_structure, table_structure, variable_structure is either not hash worthy or incomplete.

  • SQL::Abstract::More is unable to construct the SQL query using the supplied hash ref.

  • DBI is unable to execute the SQL query.

DEPENDENCIES

Carp

CLI::Framework::Exceptions

Config::General

DBI

Exception::Class::TryCatch

SQL::Abstract::More

Tie::IxHash

SEE ALSO

CohortExplorer

CohortExplorer::Application::Opal::Datasource

CohortExplorer::Application::REDCap::Datasource

CohortExplorer::Application::Command::Describe

CohortExplorer::Application::Command::Find

CohortExplorer::Application::Command::History

CohortExplorer::Application::Command::Query::Search

CohortExplorer::Application::Command::Query::Compare

LICENSE AND COPYRIGHT

Copyright (c) 2013-2014 Abhishek Dixit (adixit@cpan.org). All rights reserved.

This program is free software: you can redistribute it and/or modify it under the terms of either:

  • the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version, or

  • the " Artistic Licence ".

AUTHOR

Abhishek Dixit