NAME

SPOPS -- Simple Perl Object Persistence with Security

SYNOPSIS

# Define an object completely in a configuration file
my $spops = { 
  myobject => {
   class => 'MySPOPS::Object',
   isa => [ qw( SPOPS::DBI ) ],
   ...
  }, ...
};

# Process the configuration:
SPOPS::Configure->process_config( { config => $spops } );

# Initialize the class
MySPOPS::Object->class_initialize;

# create the object
my $object = MySPOPS::Object->new;

# Set some parameters
$object->{ $param1 } = $value1;
$object->{ $param2 } = $value2;

# Store the object in an inherited persistence mechanism
eval { $object->save };
if ( $@ ) {
  my $err_info = SPOPS::Error->get;
  die "Error trying to save object:\n",
      "$err_info->{user_msg}\n",
      "$err_info->{system_msg}\n";
}

OVERVIEW

SPOPS -- or Simple Perl Object Persistence with Security -- allows you to easily define how an object is composed and save, retrieve or remove it any time thereafter. It is intended for SQL databases (using the DBI), but you should be able to adapt it to use any storage mechanism for accomplishing these tasks. (An early version of this used GDBM, although it was not pretty.)

The goals of this package are fairly simple:

  • Make it easy to define the parameters of an object

  • Make it easy to do common operations (fetch, save, remove)

  • Get rid of as much SQL as possible, but...

  • ... do not impose a huge cumbersome framework on the developer

  • Make applications easily portable from one database to another

  • Include flexibility to allow extensions

  • Let people simply issue SQL statements and work with normal datasets if they want

So this is a class from which you can derive several useful methods. You can also abstract yourself from a datasource and easily create new objects.

The subclass is responsible for serializing the individual objects, or making them persistent via on-disk storage, usually in some sort of database. See "Object Oriented Perl" by Conway, Chapter 14 for much more information.

Neither the individual objects nor the classes should care how the objects are stored. They should just know that when they call fetch() with a unique ID, the object magically appears. Similarly, all the object should know is that it calls save() on itself and can reappear at any later date with the proper invocation.

Tie Interface

This version of SPOPS supports using a tie interface to get and set the individual data values. You can also use the more traditional OO get and set operators, but most people will likely find the hashref interface easier to deal with. (It also means you can interpolate data into strings: bonus!) Examples are given below.

The tie interface allows the most common operations -- fetch data and put it into a data structure for later use -- to be done very easily. It also hides much of the complexity behind the object for you so that most of the time you are dealing with a simple hashref.

What do the objects look like?

Here is an example getting values from CGI.pm and saving an object:

my $q = new CGI;
my $obj = MyUserClass->new();
foreach my $field ( qw( f_name l_name birthdate ) ) {
  $obj->{ $field } = $q->param( $field );
}
my $object_id = eval { $obj->save };
if ( $@ ) {
 ... report error information ...
}
else {
  warn " Object saved with ID: $obj->{object_id}\n";
}

You can now retrieve it later using the object_id:

my $obj = MyUserClass->fetch( $object_id );
print "First Name: $obj->{f_name}\n",
      "Last Name:  $obj->{l_name}\n",
      "Birthday:   $obj->{birthdate}\n";

You can also associate objects to other objects:

my $primary_group = $user->group;
print "Group Name: $primary_group->{name}\n";

And you can fetch batches of objects at once:

my $user_list = MyUserClass->fetch_group( { where => 'l_name LIKE ?',
                                            value => [ 'w%' ],
                                            order => 'birthdate' } );
foreach my $user ( @{ $user_list } ) {
  print " $user->{f_name} $user->{l_name} -- $user->{birthdate}\n";
}

EXAMPLES

# Retrieve all themes and print a description
my $themes = eval { $theme_class->fetch_group( { order => 'title' } ) };
if ( $@ ) { ... report error ... }
else {
  foreach my $thm ( @{ $themes } ) {
    print "Theme: $thm->{title}\n",
          "Description: $thm->{description}\n";
  }
}

# Create a new user, set some values and save
my $user = $user_class->new;
$user->{email} = 'mymail@user.com';
$user->{first_name} = 'My';
$user->{last_name}  = 'User';
my $user_id = eval { $user->save };
if ( $@ ) {
  print "There was an error: ", $R->error->report(), "\n";
}

# Retrieve that same user from the database
my $user_id = $cgi->param( 'user_id' );
my $user = eval { $user_class->fetch( $user_id ) };
if ( $@ ) { ... report error ... }
else {
  print "The user's first name is: $user->{first_name}\n";
}

my $data = MyClass->new( { field1 => 'value1', field2 => 'value2' } );

# Retrieve values using the hashref
print "The value for field2 is: $data->{field2}\n";

# Set values using the hashref
$data->{field3} = 'value3';

# Save the current data state
eval { $data->save };
if ( $@ ) { ... report error ... }

# Remove the object permanently
eval { $data->remove };
if ( $@ ) { ... report error ... }

# Call arbitrary object methods to get other objects
my $other_obj = eval { $data->call_to_get_other_object() };
if ( $@ ) { ... report error ... }

# Clone the object with an overridden value and save
my $new_data = $data->clone( { field1 => 'new value' } );
eval { $new_data->save };
if ( $@ ) { ... report error ... }

# $new_data is now its own hashref of data --
# explore the fields/values in it
while ( my ( $k, $v ) = each %{ $new_data } ) {
  print "$k == $v\n";
}

# Retrieve saved data
my $saved_data = eval { MyClass->fetch( $id ) };
if ( $@ ) { ... report error ... }
else {
  while ( my ( $k, $v ) = each %{ $saved_data } ) {
    print "Value for $k with ID $id is $v\n";
  }
}

# Retrieve lots of objects, display a value and call a 
# method on each
my $data_list = eval { MyClass->fetch_group( { where => "last_name like 'winter%'" } ) };
if ( $@ ) { ... report error ... }
else {
  foreach my $obj ( @{ $data_list } ) {
    print "Username: $obj->{username}\n";
    $obj->increment_login();
  }
}

DESCRIPTION

This module is meant to be overridden by a class that will implement persistence for the SPOPS objects. This persistence can come by way of flat text files, LDAP directories, GDBM entries, DBI database tables -- whatever. The API should remain the same.

Class Hierarchy

SPOPS (Simple Perl Object Persistence with Security) provides a framework to make your application objects persistent (meaning you can store them somewhere, e.g., in a relational database), and to control access to them (the usual user/group access rights stuff). You will usually just configure SPOPS by means of configuration files, and SPOPS will create the necessary classes and objects for your application on the fly. You can of course have your own code implement your objects, extending the default SPOPS object behavior with your methods. However, if SPOPS is to know about your classes and objects, you will have to tell it -- by configuring it.

The typical class hierarchy for an SPOPS object looks like this:

 --------------------------
|SPOPS                     |
 --------------------------
            ^
            |
 --------------------------
|SPOPS::MyStorageTechnology|
 --------------------------
            ^
            |
 --------------------------
|SPOPS::MyApplicationClass |
 --------------------------

SPOPS

Abstract base class, provides persistency and security framework (fetch, save, remove)

Example: You are reading it now!

SPOPS::MyStorageTechnology

Concrete base class, provides technical implementation of framework for a particular storage technology (e.g., Filesystem, RDBMS, LDAP, ... )

Example: SPOPS::DBI, SPOPS::GDBM, ...

SPOPS::MyApplicationClass

User class, provides semantic implementation of framework (configuration of parent class, e.g., database connection strings, field mappings, ... )

Example: MyApplication::User, MyApplication::Document, ...
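
The three layers can be sketched in plain Perl. This is only an illustration: the names Sketch::Base and Sketch::Memory are hypothetical, and an in-memory hash stands in for a real storage technology. It is not how SPOPS itself is implemented, just the shape of the hierarchy.

```perl
use strict;
use warnings;

# Abstract base: defines the API, leaves storage to subclasses.
{
    package Sketch::Base;
    sub new {
        my ( $class, $data ) = @_;
        return bless { %{ $data || {} } }, $class;
    }
    sub fetch  { die "fetch() must be implemented by a subclass\n" }
    sub save   { die "save() must be implemented by a subclass\n" }
    sub remove { die "remove() must be implemented by a subclass\n" }
}

# Storage layer (hypothetical): a hash in memory stands in for a database.
{
    package Sketch::Memory;
    our @ISA = ( 'Sketch::Base' );
    my %STORE;          # class name => { id => copy of the object data }
    my $NEXT_ID = 1;

    sub save {
        my ( $self ) = @_;
        $self->{id} ||= $NEXT_ID++;
        $STORE{ ref( $self ) }{ $self->{id} } = { %{ $self } };
        return $self->{id};
    }
    sub fetch {
        my ( $class, $id ) = @_;
        my $data = $STORE{ $class }{ $id };
        return undef unless $data;
        return $class->new( $data );
    }
    sub remove {
        my ( $self ) = @_;
        my $gone = delete $STORE{ ref( $self ) }{ $self->{id} };
        return $gone ? 1 : undef;
    }
}

# Application layer: only names the storage class it inherits from.
{
    package MyApplication::User;
    our @ISA = ( 'Sketch::Memory' );
}

my $user = MyApplication::User->new( { first_name => 'My', last_name => 'User' } );
my $id   = $user->save;
my $copy = MyApplication::User->fetch( $id );
print "Fetched: $copy->{first_name} $copy->{last_name}\n";
```

The application class contains no storage code at all, which is the point of the middle layer: swapping Sketch::Memory for another storage class would not change the calling code.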

SPOPS Object States

Basically, each SPOPS object is always in one of two states:

  • Runtime State

  • Persistency State

In Runtime State, the object representation is based on a hash of attributes. The object gets notified about any changes to it through the tie(3) mechanism.

In Persistency State, the object exists in some persistent form, that is, it is stored in a database, or written out to a file.

You can control what happens to the object when it gets written to its persistent form, or when it is deleted, or fetched from its storage form, by implementing a simple API: fetch(), save(), remove().

 -------------         save, remove         -----------------
|Runtime State|     ------------------->   |Persistency State|
 -------------      <------------------     -----------------
                          fetch

Around the fetch(), save(), and remove() calls, you can execute helper functions (pre_fetch(), post_fetch(), pre_save(), post_save(), pre_remove(), post_remove()), in case you need to prepare anything or clean up something, according to needs of your storage technology. These are pushed on a queue based on a search of @ISA, and executed front to end of the queue. If any of the calls in a given queue returns a false value, the whole action (save, remove, fetch) is short-circuited (that is, a failing method bombs out of the action). More information on this is in "Data Manipulation Callbacks: Rulesets" below.
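
As a rough sketch of the queue idea, the code below collects a named helper from a class and its parents by walking @ISA, runs the collected subs front to back, and stops at the first false return. All class and function names here (Sketch::Audit, build_queue, run_queue) are made up for the example, and the depth-first search order is an assumption, not a statement about what SPOPS does internally.

```perl
use strict;
use warnings;

my @TRACE;    # records which helpers ran, in order

# Hypothetical parent classes; each contributes its own pre_save helper.
{
    package Sketch::Audit;
    sub pre_save { push @TRACE, 'audit'; return 1 }
}
{
    package Sketch::Validate;
    # Refuse to save objects that have no name.
    sub pre_save {
        my ( $obj ) = @_;
        push @TRACE, 'validate';
        return defined $obj->{name} ? 1 : 0;
    }
}
{
    package Sketch::Doc;
    our @ISA = ( 'Sketch::Audit', 'Sketch::Validate' );
}

# Collect the named sub from a class and, depth first, from its parents.
sub build_queue {
    my ( $class, $name, $seen ) = @_;
    $seen ||= {};
    return () if $seen->{ $class }++;
    no strict 'refs';
    my @queue;
    push @queue, \&{ "${class}::${name}" } if defined &{ "${class}::${name}" };
    push @queue, build_queue( $_, $name, $seen ) for @{ "${class}::ISA" };
    return @queue;
}

# Run the queue front to back; the first false return aborts the action.
sub run_queue {
    my ( $obj, $name ) = @_;
    for my $helper ( build_queue( ref( $obj ), $name ) ) {
        return 0 unless $helper->( $obj );
    }
    return 1;
}

my $ok  = run_queue( bless( { name => 'report' }, 'Sketch::Doc' ), 'pre_save' );
my $bad = run_queue( bless( {}, 'Sketch::Doc' ), 'pre_save' );
print "named object:   ", ( $ok  ? 'would be saved' : 'aborted' ), "\n";
print "unnamed object: ", ( $bad ? 'would be saved' : 'aborted' ), "\n";
```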

API

The following includes methods within SPOPS and those that need to be defined by subclasses.

In the discussion below, the following holds:

  • When we say base class, think SPOPS

  • When we say subclass, think of SPOPS::DBI for example

Onward!

Also see the "ERROR HANDLING" section below on how we use die() to indicate an error and where to get more detailed information.

new( [ \%initialize_data ] )

Implemented by base class.

This method creates a new SPOPS object. If you pass it key/value pairs the object will initialize itself with the data (see initialize() for notes on this).

Note that you can use the key 'id' to substitute for the actual parameter name specifying an object ID. For instance:

my $uid = $user->id;
if ( eval { $user->remove } ) {
  my $new_user = MyUser->new( { id => $uid, fname => 'BillyBob', ... } );
  ...
}

In this case, we do not need to know the name of the ID field used by the MyUser class.

Returns on success: a tied hashref object with any passed data already assigned.

Returns on failure: undef.

Examples:

# Simplest form...
my $data = MyClass->new();

# ...with initialization
my $data = MyClass->new( { balance => 10532, 
                           account => '8917-918234' } );

clone( \%params )

Returns a new object from the data of the first. You can override the original data with that in the \%params passed in.

Examples:

# Create a new user bozo
my $bozo = $user_class->new;
$bozo->{first_name} = 'Bozo';
$bozo->{last_name}  = 'the Clown';
$bozo->{login_name} = 'bozosenior';
eval { $bozo->save };
if ( $@ ) { ... report error .... }

# Clone bozo; first_name is 'Bozo' and last_name is 'the Clown',
# as in the $bozo object, but login_name is 'bozojunior'
my $bozo_jr = $bozo->clone( { login_name => 'bozojunior' } );
eval { $bozo_jr->save };
if ( $@ ) { ... report error ... }

initialize()

Implemented by base class, although it is often overridden.

Cycle through the parameters and set any data necessary. This allows you to construct the object with existing data. Note that the tied hash implementation ensures that you cannot set information as a parameter unless it is in the field list for your class. For instance, passing the information:

firt_name => 'Chris'

should likely not set the data, since 'firt_name' is the misspelled version of the defined field 'first_name'.
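
To show how a tied hash can enforce a field list, here is a minimal sketch. Sketch::CheckedHash is a hypothetical class written for this example and the field names are assumptions; the real SPOPS tie class may behave differently (for instance, it may warn rather than drop the value silently).

```perl
use strict;
use warnings;

# Hypothetical tie class that rejects keys outside a fixed field list.
{
    package Sketch::CheckedHash;

    sub TIEHASH {
        my ( $class, @fields ) = @_;
        my %allowed = map { $_ => 1 } @fields;
        return bless { allowed => \%allowed, data => {} }, $class;
    }

    # Only store values whose key is a known field.
    sub STORE {
        my ( $self, $key, $value ) = @_;
        return unless $self->{allowed}{ $key };
        $self->{data}{ $key } = $value;
    }

    sub FETCH    { $_[0]->{data}{ $_[1] } }
    sub EXISTS   { exists $_[0]->{data}{ $_[1] } }
    sub DELETE   { delete $_[0]->{data}{ $_[1] } }
    sub CLEAR    { %{ $_[0]->{data} } = () }
    sub FIRSTKEY { my $reset = keys %{ $_[0]->{data} }; each %{ $_[0]->{data} } }
    sub NEXTKEY  { each %{ $_[0]->{data} } }
}

tie my %user, 'Sketch::CheckedHash', qw( first_name last_name );
$user{first_name} = 'Chris';    # known field: stored
$user{firt_name}  = 'Chris';    # misspelled field: dropped
print "Fields set: ", join( ', ', sort keys %user ), "\n";
```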

fetch( $oid, [ \%params ] )

Implemented by subclass.

This method should be called from either a class or another object with a named parameter of 'id'.

Returns on success: a SPOPS object.

Returns on failure: undef; if the action failed (incorrect fieldname in the object specification, database not online, database user cannot select, etc.) a die() will be used to raise an error.

The \%params parameter can contain a number of items -- all are optional.

Parameters:

<datasource>
  For most SPOPS implementations, you can pass the data source (a DBI
  database handle, a GDBM tied hashref, etc.) into the routine.

data
  You can use fetch() not just to retrieve data, but also to do the
  other checks it normally performs (security, caching, rulesets,
  etc.). If you already know the data to use, just pass it in using
  this hashref. The other checks will be done but not the actual data
  retrieval. (See the fetch_group routine in SPOPS::DBI for an
  example.)

skip_security
  A true value skips security checks.

skip_cache
  A true value skips any use of the cache.

In addition, specific implementations may allow you to pass in other parameters. (For example, you can pass in 'field_alter' to the SPOPS::DBI implementation so you can format the returned data.)

Example:

my $id = 90192;
my $data = eval { MyClass->fetch( $id ) };

# Read in a data file and retrieve all objects matching IDs
my @object_list = ();
while ( <DATA> ) {
  chomp;
  next if ( /\D/ );
  my $obj = eval { ObjectClass->fetch( $_ ) };
  if ( $@ ) { ... report error ... }
  else      { push @object_list, $obj  if ( $obj ) }
}

save( [ \%params ] )

Implemented by subclass.

This method should save the object state in whatever medium the module works with. Note that the method may need to distinguish whether the object has been previously saved or not -- whether to do an add versus an update. See the section "TRACKING CHANGES" for how to do this. The application should not care whether the object is new or pre-owned.

Returns on success: the ID of the object if applicable, otherwise a true value.

Returns on failure: undef, and a die() to indicate that the action failed.

Example:

my $rv = eval { $obj->save };
if ( $@ ) {
  warn "Save of ", ref $obj, " did not work properly!";
}

Parameters:

<datasource>
  For most SPOPS implementations, you can pass the data source (a DBI
  database handle, a GDBM tied hashref, etc.) into the routine.

is_add
  A true value forces this to be treated as a new record.

skip_security
  A true value skips the security check.

skip_cache
  A true value skips any caching.

skip_log
  A true value skips the call to 'log_action'

remove()

Implemented by subclass.

Permanently removes the object, or if called from a class removes the object having an id matching the named parameter of 'id'.

Returns: status code based on success (undef == failure).

Parameters:

<datasource>
  For most SPOPS implementations, you can pass the data source (a DBI
  database handle, a GDBM tied hashref, etc.) into the routine.

Examples:

# First fetch then remove
my $obj = MyClass->fetch( $id );
my $rv = $obj->remove();

TRACKING CHANGES

The object tracks whether any changes have been made since it was instantiated and keeps an internal toggle switch. You can query the toggle or set it manually.

$obj->changed();

Returns 1 if there has been a change, undef if not.

$obj->has_change();

Sets the toggle to true.

$obj->clear_change();

Sets the toggle to false.

Example:

if ( $obj->changed() ) {
  my $rv = $obj->save();
}

Note that this can (and should) be implemented within the subclass, so you as a user can simply call:

$obj->save();

And not worry about whether it has been changed or not. If there has been any modification, the system will save it, otherwise it will not.
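
One way such a toggle could work is sketched below: the tie class flips a flag on every write, and save() returns early when the flag is not set. Sketch::TrackedHash and Sketch::Object are hypothetical names, and save_count() exists only so the example can show how many writes happened. The sketch relies on Tie::ExtraHash from the core Tie::Hash module, which keeps the data hash in slot 0 of the tie object and leaves the other slots free.

```perl
use strict;
use warnings;

# Tie class: remembers whether any value has been written.
{
    package Sketch::TrackedHash;
    require Tie::Hash;
    our @ISA = ( 'Tie::ExtraHash' );

    sub STORE {
        my ( $self, $key, $value ) = @_;
        $self->[1] = 1;                  # slot 1 holds the changed flag
        $self->[0]{ $key } = $value;     # slot 0 holds the data
    }
}

# Object class: a blessed reference to the tied hash.
{
    package Sketch::Object;
    my $SAVES = 0;

    sub new {
        my ( $class ) = @_;
        tie my %data, 'Sketch::TrackedHash';
        return bless \%data, $class;
    }

    sub changed      { ( tied %{ $_[0] } )->[1] ? 1 : undef }
    sub has_change   { ( tied %{ $_[0] } )->[1] = 1 }
    sub clear_change { ( tied %{ $_[0] } )->[1] = 0 }

    sub save {
        my ( $self ) = @_;
        return 1 unless $self->changed;  # nothing to write
        $SAVES++;                        # a real subclass would write to storage here
        $self->clear_change;
        return 1;
    }

    sub save_count { $SAVES }
}

my $obj = Sketch::Object->new;
$obj->save;                        # no change yet: skipped
$obj->{username} = 'ding-dong';    # STORE sets the flag
$obj->save;                        # written
$obj->save;                        # flag was cleared: skipped
print "Writes performed: ", Sketch::Object->save_count(), "\n";
```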

Automatically Created Accessors

In addition to getting the data for an object through the hashref method, you can also get to the data with accessors named after the fields.

For example, given the fields:

$user->{f_name}
$user->{l_name}
$user->{birthday}

You can call to retrieve the data:

$user->f_name();
$user->l_name();
$user->birthday();

Note that this is only to read the data, not to change it. The system does this using AUTOLOAD, and after the first call it automatically creates a subroutine in the namespace of your class which handles future calls so there is no need for AUTOLOAD on the second or future calls.
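
The AUTOLOAD trick described above can be sketched as follows. Sketch::User and its field list are assumptions for the example; the point is only the pattern of installing a real subroutine on the first call so that later calls bypass AUTOLOAD.

```perl
use strict;
use warnings;

{
    package Sketch::User;
    our $AUTOLOAD;
    my %FIELDS = map { $_ => 1 } qw( f_name l_name birthday );

    sub new {
        my ( $class, $data ) = @_;
        return bless { %{ $data || {} } }, $class;
    }

    sub AUTOLOAD {
        my ( $name ) = $AUTOLOAD =~ /::(\w+)$/;
        return if $name eq 'DESTROY';
        die "No field or method '$name'\n" unless $FIELDS{ $name };

        # Install a read-only accessor so AUTOLOAD is not needed next time.
        no strict 'refs';
        *{ $AUTOLOAD } = sub { $_[0]->{ $name } };
        goto &{ $AUTOLOAD };
    }
}

my $user = Sketch::User->new( { f_name => 'My', l_name => 'User' } );
print "Before first call: ",
      ( Sketch::User->can( 'f_name' ) ? 'sub exists' : 'no sub yet' ), "\n";
print "First name: ", $user->f_name(), "\n";
print "After first call:  ",
      ( Sketch::User->can( 'f_name' ) ? 'sub exists' : 'no sub yet' ), "\n";
```

Because the generated sub ignores its arguments, calling it with a value reads the field and changes nothing, which matches the read-only behavior described above.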

DATA ACCESS METHODS

Most of this information can be accessed through the CONFIG hashref, but we also need to create some hooks for subclasses to override if they wish. For instance, language-specific objects may need to be able to modify information based on the language abbreviation.

We have simple methods here just returning the basic CONFIG information. The following are defined:

  • lang ( $ )

    Returns a language code (e.g., 'de' for German; 'en' for English). This only works if defined by your class.

  • no_cache ( bool )

    Returns a boolean based on whether this object can be cached or not. This does not mean that it will be cached, just whether the class allows its objects to be cached.

  • field( \% )

    Returns a hashref (which you can sort by the values if you wish) of fieldnames used by this class.

  • field_list( \@ )

    Returns an arrayref of fieldnames used by this class.

  • timestamp_field ( $ )

    Returns a fieldname used for the timestamp. Having a blank or undefined value for this is ok. But if you do define it, your UPDATEs will be checked to ensure that the timestamp values match up. If not, the system will throw an error. (Note, this is not yet implemented.)

Subclasses can define their own where appropriate.

"GLOBALS"

These objects are tied together by just a few things:

global_config

A few items sprinkled throughout the SPOPS hierarchy need information provided in a configuration file. See SPOPS::Configure for more information about what should be in it, what form it should take and some of the nifty tricks you can do with it.

Returns: a hashref of configuration information.

global_cache

A caching object. If you have

{cache}->{SPOPS}->{use}

in your configuration set to '0', then you do not need to worry about this. Otherwise, the caching module should implement:

The method get(), which returns the property values for a particular object.

$cache->get( { class => 'SPOPS-class', id => 'id' } )

The method set(), which saves the property values for an object into the cache.

$cache->set( { data => $spops_object } );

This is a fairly simple interface which leaves implementation pretty much wide open.
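
A minimal in-memory cache honoring these two calls might look like the sketch below. Only the get() and set() argument shapes come from the description above; the class names, the storage layout, and the choice to cache the whole object rather than a copy of its property values are assumptions.

```perl
use strict;
use warnings;

# Hypothetical in-memory cache keyed by class name and object ID.
{
    package Sketch::Cache;
    sub new { return bless { store => {} }, shift }

    # get( { class => ..., id => ... } ): the cached object, or undef
    sub get {
        my ( $self, $p ) = @_;
        return $self->{store}{ $p->{class} }{ $p->{id} };
    }

    # set( { data => $object } ): file the object under its class and ID
    sub set {
        my ( $self, $p ) = @_;
        my $obj = $p->{data};
        $self->{store}{ ref( $obj ) }{ $obj->id } = $obj;
        return 1;
    }
}

# Stand-in for an SPOPS object: anything with an id() method will do here.
{
    package Sketch::Thing;
    sub new { my ( $class, %data ) = @_; return bless { %data }, $class }
    sub id  { $_[0]->{thing_id} }
}

my $cache = Sketch::Cache->new;
my $thing = Sketch::Thing->new( thing_id => 42, title => 'Man bites dog' );
$cache->set( { data => $thing } );

my $hit  = $cache->get( { class => 'Sketch::Thing', id => 42 } );
my $miss = $cache->get( { class => 'Sketch::Thing', id => 43 } );
print "Hit:  $hit->{title}\n";
print "Miss: ", ( defined $miss ? 'found' : 'not cached' ), "\n";
```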

Note that subclasses may also have items that must be accessible to all children -- see SPOPS::DBI and the global_db_handle method.

DATA MANIPULATION CALLBACKS: RULESETS

When a SPOPS object calls fetch/save/remove, the base class takes care of most of the details for retrieving and constructing the object. However, sometimes you want to do something more complex or different. Each data manipulation method allows you to define two methods to accomplish these things. One is called before the action is taken (usually at the very beginning of the action) and the other after the action has been successfully completed.

What kind of actions might you want to accomplish? Cascading deletes (when you delete one object, delete a number of other dependent objects as well); dependent fetches (when you fetch one object, fetch all its component objects as well); implement a consistent data layer (such as full-text searching) by sending all inserts and updates to a separate module or daemon. Whatever.

Each of these actions is a rule, and together they are rulesets. There are some fairly simple guidelines to rules:

  1. Each rule is independent of every other rule. Why? Rules for a particular action may be executed in an arbitrary order. You cannot guarantee that the rule from one class will execute before the rule from a separate class.

  2. A rule should not change the data of the object on which it operates. Each rule should be operating on the same data. And since guideline 1 states the rules can be executed in any order, changing data for use in a separate rule would create a dependency between them.

  3. If a rule fails, then the action is aborted. This is central to how the ruleset operates, since it allows inherited behaviors to have a say on whether a particular object is fetched, saved or removed.

For example, you may want to implement a 'layer' over certain classes of data. Perhaps you want to collect how many times users from various groups visit a set of objects on your website. You can create a fairly simple class that puts a rule into the ruleset of its children that creates a log entry every time a particular object is fetch()ed. The class could also contain methods for dealing with this information.

This rule is entirely separate and independent from other rules, and does not interfere with the normal operation except to add information to a separate area of the database as the actions are happening. In this manner, you can think of them as a trigger as implemented in a relational database. However, triggers can (and often do) modify the data of the row that is being manipulated, whereas a rule should not.

pre_fetch_action( { id => $ } )

Called before a fetch is done, although if an object is retrieved from the cache this action is skipped. The only argument is the ID of the object you are trying to fetch.

post_fetch_action( \% )

Called after a fetch has been successfully completed, including after a positive cache hit.

pre_save_action( { is_add => bool } )

Called before a save has been attempted. If this is an add operation (versus an update), we pass in a true value for the 'is_add' parameter.

post_save_action( { is_add => bool } )

Called after a save has been successfully completed. If this object was just added to the data store, we pass in a true value for the 'is_add' parameter.

pre_remove_action( \% )

Called before a remove has been attempted.

post_remove_action( \% )

Called after a remove has been successfully completed.

FAILED ACTIONS

If an action fails, the 'fail' method associated with that action is triggered. This can be a notification to an administrator, or saving the data in the filesystem after a failed save.

fail_fetch()

Called after a fetch has been unsuccessful.

fail_save()

Called after a save has been unsuccessful.

fail_remove()

Called after a remove has been unsuccessful.
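
The sketch below shows one plausible way a save() could trigger its fail method. Sketch::Record, the _write() helper, and the decision to pass the error text to fail_save() are all assumptions made for illustration; the text above does not say what arguments the fail methods receive.

```perl
use strict;
use warnings;

my @NOTICES;    # stands in for a message to an administrator

{
    package Sketch::Record;
    sub new { return bless { %{ $_[1] || {} } }, $_[0] }

    # Pretend storage layer: fails when the record has no title.
    sub _write {
        my ( $self ) = @_;
        die "title is required\n" unless $self->{title};
        return 1;
    }

    sub save {
        my ( $self ) = @_;
        my $rv = eval { $self->_write };
        if ( $@ ) {
            my $error = $@;
            $self->fail_save( $error );    # give the hook a chance to react
            die $error;                    # still report the failure to the caller
        }
        return $rv;
    }

    sub fail_save {
        my ( $self, $error ) = @_;
        push @NOTICES, "save failed: $error";
    }
}

my $rv = eval { Sketch::Record->new( {} )->save };
my $caller_error = $@;
print "Caller saw: $caller_error";
print "Hook recorded: $NOTICES[0]";
```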

CACHING

SPOPS has object caching built in. As mentioned above, you will need to define a global_cache either in your SPOPS object class or in one of its parents. Typically, you will put the stash class in the @ISA of your SPOPS object.

pre_cache_fetch()

Called before an item is fetched from the cache; if this is called, we know that the object is in the cache, we just have not retrieved it yet.

post_cache_fetch()

Called after an item is successfully retrieved from the cache.

pre_cache_save()

Called before an object has been cached.

post_cache_save()

Called after an object has been cached.

pre_cache_remove()

Called before an object is removed from the cache.

post_cache_remove()

Called after an object is successfully removed from the cache.

OTHER INDIVIDUAL OBJECT METHODS

get( $param_name )

Returns the currently stored information within the object for $param_name.

my $value = $obj->get( 'username' );
print "Username is $value";

It might be easier to use the hashref interface to the same data, since you can inline it in a string:

print "Username is $obj->{username}";

You may also use the parameter name as a method call, as a shortcut for the get() form:

my $value = $obj->username();
print "Username is $value";

set( $param_name, $value )

Sets the value of $param_name to $value. If $value is empty, $param_name is set to undef.

$obj->set( 'username', 'ding-dong' );

Again, you can also use the hashref interface to do the same thing:

$obj->{username} = 'ding-dong';

Note that, unlike get(), you cannot use the parameter name as a method to set a value. So a call like:

my $username = $obj->username( 'new_username' );

Will silently ignore any parameters that are passed and simply return the information as get() would.

id()

Returns the ID for this object. Checks in its config variable for the ID field and looks at the data there. If nothing is currently stored, you will get nothing back.

Note that we also create a subroutine in the namespace of the calling class so that future calls take place more quickly.

changed()

Returns the current status of the data in this object, whether it has been changed or not.

has_change()

Sets the changed flag of this object to true.

clear_change()

Sets the changed flag of this object to false.

is_checking_fields()

Returns 1 if this object (and class) checks to ensure that you use only the right fieldnames for an object, 0 if not.

timestamp()

Returns the value of the timestamp_field for this object, undef if the timestamp_field is not defined.

timestamp_compare( $ts_check )

Returns true if $ts_check matches what is in the object, false otherwise.

object_description()

Returns a hashref with three keys of information about a particular object:

url
  URL that will display this object

name
  Name of this general class of object (e.g., 'News')

title
  Title of this particular object (e.g., 'Man bites dog, film at 11')

ERROR HANDLING

(See SPOPS::Error for now -- more later!)

NOTES

There is an issue using these modules with Apache::StatINC along with the startup methodology that calls the class_initialize method of each class when a httpd child is first initialized. If you modify a module without stopping the webserver, the configuration variable in the class will not be initialized and you will inevitably get errors.

We might be able to get around this by having most of the configuration information as static class lexicals. But anything that depends on any information from the CONFIG variable in request (which is generally passed into the class_initialize call for each SPOPS implementation) will get hosed.

TO DO

Allow call to pass information to rulesets

Modify all calls to pre_fetch_action (etc.) to take a hashref of information that can be used by the ruleset. For instance, if I do not want an object indexed by the full-text ruleset (even though the class uses it), I could do:

eval { $obj->save( { full_text_skip => 1 } ) };

Objects composed of many records

An idea: Make this data item framework much like the one Brian Jepson discusses in Web Techniques:

http://www.webtechniques.com/archives/2000/03/jepson/

At least in terms of making each object unique (having an OID). Each object could then be simply a collection of table name plus ID name in the object table:

CREATE TABLE objects (
  oid        int not null,
  table_name varchar(30) not null,
  id         int not null,
  primary key( oid, table_name, id )
)

Then when you did:

my $oid  = 56712;
my $user = User->fetch( $oid );

It would first get the object composition information:

oid    table        id
===    =====        ==
56712  user         1625
56712  user_prefs   8172
56712  user_history 9102

And create the User object with information from all three tables.

Something to think about, anyway.

BUGS

COPYRIGHT

Copyright (c) 2000 intes.net, inc. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

MORE INFORMATION

Find out more about SPOPS -- current versions, updates, rants, ideas -- at:

http://www.openinteract.org/SPOPS/

AUTHORS

Chris Winters <chris@cwinters.com>

Christian Lemburg <clemburg@online-club.de> contributed some documentation and far too many good ideas to implement

Rusty Foster <rusty@kuro5hin.org> was also influential in the early days of this library.