NAME

Class::DBI::Audit - Audit changes to columns in CDBI objects.

SYNOPSIS

  # Base class
  package Music::DBI;
  use base 'Class::DBI';
  use mixin 'Class::DBI::Audit';  
  Music::DBI->connection('dbi:mysql:dbname', 'username', 'password');
  __PACKAGE__->auditColumns({
      remote_user => [ from_hash => 
          { name => 'ENV', key => 'REMOTE_USER' } ], 
      time_stamp  => [ from_sub => 
          { subroutine => sub { scalar localtime; } } ]
  });

  # Derived class
  package Music::Artist;
  use base 'Music::DBI';
  __PACKAGE__->table('artist');
  __PACKAGE__->auditTable('artist_audit');
  __PACKAGE__->columns(All   => qw/artistid first_name last_name/); 
  __PACKAGE__->columns(Audit => qw/first_name last_name/); 
  __PACKAGE__->add_audit_triggers;
  
  # (or everything can go in the base or derived class, if you want)

/* 
 * Now create an audit table, to track changes to first + last names
 * of artists :
 */ 
 create table artist_audit (
     id           int unsigned NOT NULL auto_increment primary key,

     /* These 5 columns are mandatory */
     parent_id    int unsigned NOT NULL, 
     query_type   enum('update','insert','delete'),  
     column_name  varchar(255),
     value_before blob,
     value_after  blob,

     /* The rest reflect auditColumns (above) */
     time_stamp   datetime,
     remote_user  varchar(255)
  );

  # Then in your main program :

  $ENV{REMOTE_USER} = 'Puff'
  $artist = Music::Artist->insert({ 
          first_name => 'Jennifer', 
          last_name => 'Lopez' });

  $ENV{REMOTE_USER} = 'Ben'
  $artist->first_name('J');
  $artist->last_name('Lo');
  $artist->update;

  for my $column (qw(first_name last_name)) {
      for ($artist->column_history($column)) {
          print $_->{remote_user}.
                " set $column to ".
                $_->{value_after}.
                "\n";
      }
  }
  # Puff set first_name to Jennifer
  # Ben set first_name to J
  # Puff set last_name to Jennifer
  # Ben set last_name to Lo

DESCRIPTION

This module allows easy tracking of changes to values in tables managed by CDBI classes. It helps you answer the question "who set that value to be 'foobar', I thought I set it to be 'farbar'?" without resorting to digging through snapshots of your database tables and comparing them to your webserver's http logs.

Use this module as a mixin with either your base CDBI class, or a derived one, and the following methods will be added to your class (or classes) :

auditTable()
auditColumns()
add_audit_triggers()
column_history()

The first two specify the external audit table, ('artist_audit' above), and the columns of this table (time_stamp and remote_user above).

The third method adds the necessary triggers to your CDBI class which will track the changes, writing them to the auditTable.

The fourth returns a history of changes to a column (i.e. the data from the audit table) as an array of hashrefs.

Only columns in the 'Audit' group are audited. Set this like so :

__PACKAGE__->columns(Audit => qw/first_name last_name);

You can use either one huge audit table for all of the classes you wish to audit (in which case you'll want 'table' to be an element of the auditColumns, see below), or you can have separate audit tables for each class. Or some combination. Since audit tables get big quickly, you'll probably want several tables.

METHODS

auditColumns

Set this class data to be a hash which specifies what goes in your audit table, e.g.

__PACKAGE__->auditColumns({
    # hash from column name to where it comes from
    remote_addr => [ from_hash   => { name => 'ENV', 
                                      key => 'REMOTE_ADDR' } ], 
    remote_user => [ from_hash   => { name => 'ENV', 
                                      key => 'REMOTE_USER' } ], 
    request_uri => [ from_hash   => { name => 'ENV', 
                                      key => 'REQUEST_URI' } ],
    command     => [ from_scalar => { name => '0',    } ], 
    table       => [ from_method => { name => 'table' } ],
    time_stamp  => [ from_sub    => { subroutine => sub { 
                    strftime("%Y-%m-%d %H:%M:%S",localtime) 
                    } } ]
});

...means store these values :

$ENV{REMOTE_ADDR},
$ENV{REMOUTE_USER},
$ENV{REQUEST_URI},
$0,
$self->table,
the value returned by the anonymous subroutine 
    sub { strftime("%Y-%m-%d %H:%M:%S",localtime)  }

in columns named remote_addr, remote_user, remote_uri, command, table, and time_stamp respectively.

from_hash and from_scalar columns both look in the 'main::' symbol table for their variables, override this with a 'package' entry if desired.

auditTable

By default the audit table is the name of the CDBI table with '_audit' appended to the end. Change this by calling auditTable(). If multiple tables are using the same database table for auditing, you'll want to give 'table' as one of the methods in auditColumns (so you can tell what table a row in the audit table refers to).

add_audit_triggers

Adds all the triggers below to a class.

after_create

A subroutine to be used in the after_create trigger.

before_update

A subroutine to be used in the before_update trigger.

after_update

To be used in the after_update trigger.

before_delete

To be used in the before_delete trigger.

column_history

Fetch the history of a column from the audit table. Returns an array of hashrefs whose keys correspond to the values in the audit table.

NOTES

Data in the audit table is always added, never deleted or changed. Some databases may be optimized for such tables (e.g. the MySQL "archive" engine)

If a field with just whitespace is changed to another field with just whitespace, this is ignored. (But NULLs changing to not NULLs and vice versa are logged.)

If a field that looks like a number is changed to another one that looks like a number with the same value, this is ignored. See _values_differ() in the source code.

Most likely, value_before and value_after will have some redundancy (since the next value_before should be the previous value_after); this is intentional, since it'll cause any non-audited changes to the cdbi table to show up.

All the triggers get data directly from the database using the primary key + table + primary key value. This is to avoid side effects (e.g. accidentally populating some fields of the object), and to ensure that the audit tables contain a record of the actual data in the table, rather than anything in memory, or anything that was inflated or filtered via select triggers.

TODO

Provide a mechanism for overriding _values_differ().